Abstract

We give a 2-approximation algorithm for Non-Uniform Sparsest Cut that runs in time $n^{O(k)}$,
where $k$ is the treewidth of the graph. This improves on the previous $2^{2^k}$-approximation in time
$\mathrm{poly}(n)\, 2^{O(k)}$ due to Chlamtac et al. [CKR10].

To complement this algorithm, we show the following hardness results: If the Non-Uniform Sparsest
Cut problem has a $\rho$-approximation for series-parallel graphs (where $\rho > 1$), then the MaxCut
problem has an algorithm with approximation factor arbitrarily close to $1/\rho$. Hence, even for such
restricted graphs (which have treewidth 2), the Sparsest Cut problem is NP-hard to approximate
better than $17/16 - \varepsilon$ for $\varepsilon > 0$; assuming the Unique Games Conjecture the hardness becomes
$1/\alpha_{GW} - \varepsilon$. For graphs with large (but constant) treewidth, we show a hardness result of $2 - \varepsilon$
assuming the Unique Games Conjecture.

Our algorithm rounds a linear program based on (a subset of) the Sherali-Adams lift of the
standard Sparsest Cut LP. We show that even for treewidth-2 graphs, the LP has an integrality gap
close to 2 even after polynomially many rounds of Sherali-Adams. Hence our approach cannot be
improved even on such restricted graphs without using a stronger relaxation.

1 Introduction

The Sparsest Cut problem takes as input a "supply" graph $G = (V, E_G)$ with positive edge capacities
$\{\mathrm{cap}_e\}_{e \in E_G}$, and a "demand" graph $D = (V, E_D)$ (on the same set of vertices $V$) with demand values
$\{\mathrm{dem}_e\}_{e \in E_D}$, and aims to determine

$$\min_{S \subseteq V} \frac{\sum_{e \in \partial_G(S)} \mathrm{cap}_e}{\sum_{e \in \partial_D(S)} \mathrm{dem}_e},$$

where $\partial_G(S)$ denotes the edges crossing the cut $(S, V \setminus S)$ in graph $G$. When $E_D = \binom{V}{2}$ with $\mathrm{dem}_e = 1$,
the problem is called Uniform Demands Sparsest Cut, or simply Uniform Sparsest Cut. Our results all
hold for the non-uniform demands case.
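
To make the objective concrete, the following is a minimal brute-force sketch in Python (it is an illustration only, not the algorithm of this paper, and it takes exponential time). The names `cut_sparsity` and `sparsest_cut_bruteforce` and the dictionary encoding of capacities and demands are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations

def cut_sparsity(S, cap, dem):
    """Sparsity of the cut (S, V \\ S): crossing capacity / crossing demand.

    cap and dem map frozenset({u, v}) -> positive weight.
    Returns None when no demand crosses the cut (the ratio is undefined).
    """
    S = set(S)
    crossing = lambda w: sum(val for e, val in w.items() if len(e & S) == 1)
    d = crossing(dem)
    return Fraction(crossing(cap)) / d if d else None

def sparsest_cut_bruteforce(V, cap, dem):
    """Try every nonempty proper subset S of V; return (best sparsity, S)."""
    V = list(V)
    best = None
    for r in range(1, len(V)):
        for S in combinations(V, r):
            val = cut_sparsity(S, cap, dem)
            if val is not None and (best is None or val < best[0]):
                best = (val, frozenset(S))
    return best

# Hypothetical toy instance: a 4-cycle with unit capacities and uniform demands.
V = [0, 1, 2, 3]
cap = {frozenset(e): 1 for e in [(0, 1), (1, 2), (2, 3), (3, 0)]}
dem = {frozenset(p): 1 for p in combinations(V, 2)}
value, side = sparsest_cut_bruteforce(V, cap, dem)
```

On this toy instance the best cut splits the cycle into two adjacent pairs: it cuts 2 units of capacity and separates 4 unit demands.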

The Sparsest Cut problem is known to be NP-hard due to a result of Matula and Shahrokhi [MS90],
even for unit capacity edges and uniform demands. The best algorithm for Uniform Sparsest Cut on
general graphs is an $O(\sqrt{\log n})$-approximation due to Arora, Rao, and Vazirani [ARV09]; for
Non-Uniform Sparsest Cut the best factor is $O(\sqrt{\log n} \log\log n)$ due to Arora, Lee and Naor [ALN08]. An
older $O(\sqrt{\log n})$-approximation for Non-Uniform Sparsest Cut is known for all excluded-minor families of
graphs [Rao99], and constant-factor approximations exist for more restricted classes of graphs [GNRS04,
CGN+06, CJLV08, LR10, LS09, CSW10]. Constant-factor approximations are known for Uniform
Sparsest Cut for all excluded-minor families of graphs [KPR93, Rab03]. [GS13] give a $(1 + \varepsilon)$-approximation
algorithm for non-uniform Sparsest Cut that runs in time depending on the generalized spectrum
of the graphs $(G, D)$. All above results, except [GS13], consider either the standard linear or SDP
relaxations. The integrality gaps of convex relaxations of Sparsest Cut are intimately related to questions
of embeddability of finite metric spaces into $\ell_1$; see, e.g., [LLR95, GNRS04, KV05, KR09, LN06, CKN09,
LS11, CKN11] and the many references therein. Integrality gaps for LPs/SDPs obtained from
lift-and-project techniques appear in [CMM09, KS09, RS09, GSZ12]. [GNRS04] conjectured that metrics
supported on graphs excluding a fixed minor embed into $\ell_1$ with distortion $O(1)$ (depending on the
excluded minor, but independent of the graph size); this would imply $O(1)$-approximations to
Non-Uniform Sparsest Cut on instances $(G, D)$ where $G$ excludes a fixed minor. This conjecture has been
verified for several classes of graphs, but remains open (see, e.g., [LS09] and references therein).

The starting point of this work is the paper of Chlamtac et al. [CKR10], who consider non-uniform
Sparsest Cut on graphs of treewidth $k$.$^1$ They ask if one can obtain good algorithms for such graphs
without answering the [GNRS04] conjecture; in particular, they look at the Sherali-Adams hierarchy. In
their paper, they give a $2^{2^k}$-approximation in time $\mathrm{poly}(n)\, 2^{O(k)}$ by solving the $k$-round Sherali-Adams
linear program and ask whether one can achieve an algorithm whose approximation ratio is independent
of the treewidth $k$. We answer this question in the affirmative.

Theorem 1.1 (Easiness) There is an algorithm for the Non-Uniform Sparsest Cut problem that, given
any instance $(G, D)$ where $G$ has treewidth $k$, outputs a 2-approximation in time $n^{O(k)}$.

Graphs that exclude some planar graph as a minor have bounded treewidth, and $H$-minor-free graphs
have treewidth $O(|H|^{3/2} \sqrt{n})$. This implies a 2-approximation for planar-minor-free graphs in
poly-time, and for general minor-free graphs in time $2^{\tilde{O}(\sqrt{n})}$. In fact, we only need $G$ to have a recursive vertex
separator decomposition where each separator has $k$ vertices for the above theorem to apply.

Our algorithm is also based on solving an LP relaxation, one whose constraints form a subset of the
$O(k \log n)$-round Sherali-Adams lift of the standard LP, and then rounding it via a natural propagation
rounding procedure. We show that further applications of the Sherali-Adams operator (even for a
polynomial number of rounds) cannot do better:

Theorem 1.2 (Tight Integrality Gap) For every $\varepsilon > 0$, there are instances $(G, D)$ of the
Non-Uniform Sparsest Cut problem with $G$ having treewidth 2 (a.k.a. series-parallel graphs) for which the
integrality gap after applying $r$ rounds of the Sherali-Adams hierarchy still remains $2 - \varepsilon$, even when
$r = n^{\delta}$ for some constant $\delta = \delta(\varepsilon) > 0$.

This result extends the integrality gap lower bound for the basic LP on series-parallel graphs shown by 
Lee and Raghavendra [LR10], for which Chekuri, Shepherd and Weibel gave a different proof [CSW10]. 

On the hardness side, Ambühl et al. [AMS11] showed that if Uniform Sparsest Cut admits a PTAS,
then SAT has a randomized sub-exponential time algorithm. Chawla et al. [CKK+06] and Khot and
Vishnoi [KV05] showed that Non-Uniform Sparsest Cut is hard to approximate to any constant factor,
assuming the Unique Games Conjecture. The only Apx-hardness result (based on P $\ne$ NP) for
Non-Uniform Sparsest Cut is recent, due to Chuzhoy and Khanna [CK09, Theorem 1.4]. Their reduction
from MaxCut shows that the problem is Apx-hard even when $G$ is $K_{2,n}$, and hence of treewidth or even
pathwidth 2. (This reduction was rediscovered by Chlamtac, Krauthgamer, and Raghavendra [CKR10].)
We extend their reduction to show the following hardness result for the Non-Uniform Sparsest Cut
problem:

Theorem 1.3 (Improved NP-Hardness) For every constant $\varepsilon > 0$, the Non-Uniform Sparsest Cut
problem is hard to approximate better than $17/16 - \varepsilon$ unless P = NP and hard to approximate better than
$1/\alpha_{GW} - \varepsilon$ assuming the Unique Games Conjecture, even on graphs with treewidth 2 (series-parallel
graphs).

$^1$ We emphasize that only the supply graph $G$ has bounded treewidth; the demand graphs $D$ are unrestricted.

Our proof of this result gives us a hardness-of-approximation that is essentially the same as that for 
MaxCut (up to an additive e loss). Hence, improvements in the NP-hardness for MaxCut would 
translate into better NP-hardness for Non-Uniform Sparsest Cut as well. 

If we allow instances of larger treewidth, we get a Unique Games-based hardness that matches our 
algorithmic guarantee: 

Theorem 1.4 (Tight UG Hardness) For every constant $\varepsilon > 0$, it is UG-hard to approximate
Non-Uniform Sparsest Cut on bounded treewidth graphs better than $2 - \varepsilon$. I.e., the existence of a family
of algorithms, one for each treewidth $k$, that run in time and give $(2 - \varepsilon)$-approximations for
Non-Uniform Sparsest Cut would disprove the Unique Games Conjecture.

1.1 Other Related Work 

There is much work on algorithms for bounded treewidth graphs: many NP-hard problems can be solved 
exactly on such graphs in polynomial time (see, e.g., [RS86]). Bienstock and Ozbay [BO04] show, e.g., 
that the stable set polytope on treewidth-$k$ graphs is integral after $k$ levels of Sherali-Adams; Magen
and Moharrami [MM09] use their result to show that $O(1/\varepsilon)$ rounds of Sherali-Adams are enough to
$(1 + \varepsilon)$-approximate stable set and vertex cover on minor-free graphs. Wainwright and Jordan [WJ04]
show conditions under which Sherali-Adams and Lasserre relaxations are integral for combinatorial 
problems based on the treewidth of certain hypergraphs. In contrast, our lower bounds show that the 
Sparsest Cut problem is Apx-hard even on treewidth-2 supply graphs, and the integrality gap stays 
close to 2 even after a polynomial number of rounds of Sherali-Adams. 

2 Preliminaries and Notation 

We use $[n]$ to denote the set $\{1, 2, \ldots, n\}$. For a set $A$ and element $i$, we use $A + i$ to denote $A \cup \{i\}$.

2.1 Cuts and MaxCut Problem 

All the graphs we consider are undirected. For a graph $G = (V, E)$ and set $S \subseteq V$, let $\partial_G(S)$ be the
edges with exactly one endpoint in $S$; we drop the subscript when $G$ is clear from context. Given
vertices $V$ and special vertices $s, t$, a cut $(A, V \setminus A)$ is $s$-$t$-separating if $|A \cap \{s, t\}| = 1$.

In the (unweighted) MaxCut problem, we are given a graph $G = (V, E)$ and want to find a set $S \subseteq V$
that maximizes $|\partial_G(S)|$; the weighted version has weights on edges and seeks to maximize the weight on
the crossing edges. The approximability of weighted and unweighted versions of MaxCut differ only
by a $(1 + o(1))$-factor [CST01], and henceforth we only consider the unweighted case.

2.2 Tree Decompositions and Treewidth 

Given a graph $G = (V, E_G)$, a tree decomposition consists of a tree $T = (X, E_X)$ and a collection
of node subsets $\{U_i \subseteq V\}_{i \in X}$ called "bags" such that the bags containing any node $v \in V$ form a
connected subtree of $T$ and each edge in $E_G$ lies within some bag in the collection. The width of
such a tree decomposition is $\max_{i \in X} (|U_i| - 1)$, and the treewidth of $G$ is the smallest width of any
tree-decomposition for $G$. See, e.g., [Die00, Bod98] for more details and references.
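
The two conditions in this definition can be checked mechanically. The sketch below is illustrative: the function names and the adjacency-dictionary encoding of the tree are assumptions made for the example, and the input `tree_adj` is assumed (not verified) to be a tree. It also requires every vertex to appear in some bag, which is the usual convention.

```python
def is_tree_decomposition(vertices, edges, tree_adj, bags):
    """Check the tree-decomposition conditions.

    tree_adj: dict tree-node -> set of neighbouring tree nodes (assumed a tree).
    bags:     dict tree-node -> set of graph vertices (the bag U_i).
    """
    # Every graph edge must lie inside some bag.
    for u, v in edges:
        if not any(u in B and v in B for B in bags.values()):
            return False
    # The tree nodes whose bags contain v must induce a connected subtree.
    for v in vertices:
        nodes = {i for i, B in bags.items() if v in B}
        if not nodes:
            return False
        stack, seen = [next(iter(nodes))], set()
        while stack:
            i = stack.pop()
            if i in seen:
                continue
            seen.add(i)
            stack.extend(j for j in tree_adj[i] if j in nodes and j not in seen)
        if seen != nodes:
            return False
    return True

def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(B) for B in bags.values()) - 1

# Hypothetical example: the 4-cycle 0-1-2-3-0.
cycle_vertices = [0, 1, 2, 3]
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
good_bags = {0: {0, 1, 2}, 1: {0, 2, 3}}
good_tree = {0: {1}, 1: {0}}
# One bag per edge along a path: vertex 0 sits in the two end bags only.
bad_bags = {0: {0, 1}, 1: {1, 2}, 2: {2, 3}, 3: {3, 0}}
bad_tree = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

The first decomposition has width 2; the second covers every edge but violates the connectivity condition for vertex 0.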

The notion of treewidth is intimately connected to the underlying graph $G$ having small vertex
separators. Indeed, say graph $G = (V, E)$ admits (weighted) vertex separators of size $K$ if for every assignment
of positive weights to the vertices $V$, there is a set $X \subseteq V$ of size at most $K$ such that no component of
$G - X$ contains more than $2/3$ of the total weight $\sum_{v \in V} w_v$. For example, planar graphs admit weighted
vertex separators of size $O(\sqrt{n})$. It is known (see, e.g., [Ree92, Theorem 1]) that if $G$ has treewidth
$k$ then $G$ admits weighted vertex separators of size at most $k + 1$; conversely, if $G$ admits weighted
vertex separators of size at most $K$ then $G$ has treewidth at most $4K$. (The former statement is easy.
An easy weaker version of the latter implication with treewidth $O(K \log n)$ is obtained as follows. Find
an unweighted vertex separator $X \subseteq V$ of $G$ of size $K$ to get subgraphs $G_1, G_2, \ldots, G_t$, each with at
most $2/3$ of the nodes. Recurse on the subgraphs $G_i \cup X$ to get decomposition trees $T_1, \ldots, T_t$. Attach
a new empty bag $U$, connecting $U$ to the "root" bag in each $T_i$ to get the decomposition tree $T$, add
the vertices of $X$ to all the bags in $T$, and designate $U$ as its root. Note that $T$ has height $O(\log n)$ and
width $O(K \log n)$. In fact, this tree decomposition can be used instead of the one from Theorem 3.1 for
our algorithm in Section 3 to get the same asymptotic guarantees.)

2.3 The Sherali-Adams Operator 

For a graph with $|V| = n$, we now define the Sherali-Adams polytope. We can strengthen an LP
by adding all variables $x(S, T)$ such that $|S| \le r$ and $T \subseteq S$. The variable $x(S, T)$ has the "intended
solution" that the chosen cut $(A, \bar{A})$ satisfies $A \cap S = T$.$^2$ We can then define the $r$-round Sherali-Adams
polytope (starting with the trivial LP), denoted $\mathrm{SA}_r(n)$, to be the set of all vectors $(y_{uv})_{u,v \in V} \in \mathbb{R}^{\binom{n}{2}}$
satisfying the following constraints:

$$y_{uv} = x(\{u, v\}, \{u\}) + x(\{u, v\}, \{v\}) \qquad \forall u, v \in V \tag{2.1}$$
$$\sum_{T \subseteq S} x(S, T) = 1 \qquad \forall S \subseteq V \text{ s.t. } |S| \le r \tag{2.2}$$
$$x(S, T) = x(S + u, T) + x(S + u, T + u) \qquad \forall S \subseteq V \text{ s.t. } |S| \le r - 1,\ T \subseteq S,\ u \notin S \tag{2.3}$$
$$x(S, T) \ge 0 \qquad \forall S \subseteq V \text{ s.t. } |S| \le r,\ T \subseteq S \tag{2.4}$$

We will refer to (2.3) as consistency constraints. These constraints immediately imply that the x(S,T) 
variables satisfy the following useful property: 

Lemma 2.1 For every pair of disjoint sets $S, S' \subseteq V$ such that $|S \cup S'| \le r$ and for any $T \subseteq S$, we
have:

$$x(S, T) = \sum_{T' \subseteq S'} x(S \cup S', T \cup T').$$

Proof. This follows by repeated use of (2.3). ■
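
As a sanity check on the intended meaning of the variables, the sketch below builds $x(S, T) = \Pr[A \cap S = T]$ from an explicit distribution over cuts and verifies (2.2), (2.3), and Lemma 2.1 by enumeration. This is an illustrative integral example (so it satisfies the constraints for every number of rounds), not an LP solution; the helper names and the toy distribution are assumptions made for the example.

```python
from fractions import Fraction
from itertools import combinations

def subsets(S):
    """All subsets of S, as frozensets."""
    S = sorted(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def make_x(dist):
    """Oracle x(S, T) = Pr[A ∩ S = T] for dist: frozenset(A) -> probability."""
    def x(S, T):
        S, T = frozenset(S), frozenset(T)
        return sum((p for A, p in dist.items() if A & S == T), Fraction(0))
    return x

def satisfies_consistency(x, V):
    """Check constraints (2.2) and (2.3) for every S ⊆ V (tiny V only)."""
    V = frozenset(V)
    for S in subsets(V):
        if sum(x(S, T) for T in subsets(S)) != 1:            # (2.2)
            return False
        for T in subsets(S):
            for u in V - S:                                    # (2.3)
                if x(S, T) != x(S | {u}, T) + x(S | {u}, T | {u}):
                    return False
    return True

def lemma_2_1_holds(x, S, S_prime, T):
    """Check x(S, T) = sum over T' ⊆ S' of x(S ∪ S', T ∪ T') for disjoint S, S'."""
    S, S_prime, T = map(frozenset, (S, S_prime, T))
    return x(S, T) == sum(x(S | S_prime, T | Tp) for Tp in subsets(S_prime))

# Hypothetical distribution over cut sides A ⊆ {0, 1, 2}.
dist = {frozenset(): Fraction(1, 4),
        frozenset({0}): Fraction(1, 4),
        frozenset({0, 1}): Fraction(1, 2)}
x = make_x(dist)
y_01 = x({0, 1}, {0}) + x({0, 1}, {1})   # constraint (2.1)
```

Here `y_01` is the probability that vertices 0 and 1 land on different sides, which is exactly how (2.1) is meant to be read.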
We can now use $\mathrm{SA}_r(n)$ to write an LP relaxation for an instance $G = (V, E)$ of MaxCut:

$$\max \sum_{(u,v) \in E} y_{uv} \quad \text{s.t. } (y_{uv})_{u,v \in V} \in \mathrm{SA}_r(n) \tag{2.5}$$

We can also define an LP relaxation for an instance $(G, D)$ of Non-Uniform Sparsest Cut:

$$\min \frac{\sum_{(u,v) \in E_G} \mathrm{cap}_{uv}\, y_{uv}}{\sum_{(u,v) \in E_D} \mathrm{dem}_{uv}\, y_{uv}} \quad \text{s.t. } (y_{uv})_{u,v \in V} \in \mathrm{SA}_r(n) \tag{2.6}$$

Note that the Sparsest Cut objective function is a ratio, so this is not actually an LP as stated. Instead,
we could add the constraint $\sum_{(u,v) \in E_D} \mathrm{dem}_{uv}\, y_{uv} \ge \alpha$, minimize $\sum_{(u,v) \in E_G} \mathrm{cap}_{uv}\, y_{uv}$, and use binary
search to find the correct value of $\alpha$. In Section 3, we will use (a slight weakening of) this relaxation in
our approximation algorithm for Sparsest Cut on bounded-treewidth graphs, and in Section 6 we will
show that Sherali-Adams integrality gaps for the MaxCut LP (2.5) can be translated into integrality
gaps for the Sparsest Cut LP (2.6).



$^2$ In some uses of Sherali-Adams, variables $x_{S,T}$ are intended to mean that $A \cap (S \cup T) = S$; this is not the case here.

3 An Algorithm for Bounded Treewidth Graphs



In this section, we present a 2-approximation algorithm for Sparsest Cut that runs in time $n^{O(\text{treewidth})}$.
Consider an instance $(G, D)$ of Sparsest Cut, where $G$ has treewidth $k'$, but there are no constraints
on the demand graph $D$. We assume that we are also given an initial tree-decomposition
$(T' = (X', E_{X'}); \{U'_i \subseteq V \mid i \in X'\})$ for $G$. This is without loss of generality, since such a tree-decomposition
can be found, e.g., in time $O(n^{k'+2})$ [ACP87] or time $O(n) \cdot \exp(\mathrm{poly}(k'))$ [Bod96]; a tree-decomposition
of width $O(k' \log k')$ can be found in $\mathrm{poly}(n)$ time [Ami10].

3.1 Balanced Tree Decompositions and the Linear Program 

We start with a result of Bodlaender [Bod89, Theorem 4.2] which converts the initial tree decomposition 
into a "nice" one, while increasing the width only by a constant factor: 

Theorem 3.1 (Balanced Tree Decomp.) Given graph $G = (V, E_G)$ and a tree decomposition
$(T' = (X', E_{X'}); \{U'_i \subseteq V \mid i \in X'\})$ for $G$ with width at most $k'$, there is a tree decomposition
$(T = (X, E_X); \{U_i \subseteq V \mid i \in X\})$ for $G$ such that

(a) $T$ is a binary tree of depth at most $\lambda := 2\lceil \log_{5/4}(2n) \rceil$, and

(b) $\max_{i \in X} |U_i|$ is at most $k := 3k' + 3$; and hence the width is at most $k - 1$.

Moreover, given $G$ and $T'$, such a decomposition $T$ can be found in time $O(n)$.

From this point on, we will work with the balanced tree decomposition $T = (X, E_X)$, whose root node
is denoted by $r \in X$. Let $P_{ra}$ denote the set of nodes on the tree path in $T$ between nodes $a, r \in X$
(inclusive), and let $V_a = \bigcup_{b \in P_{ra}} U_b$ be the union of the bags $U_b$ along this $r$-$a$ tree path. Note that
$|V_a| \le k \cdot \lambda$.

Recall the Sherali-Adams linear program (2.6), with variables $x(S, T)$ for $T \subseteq S$ having the intended
meaning that the chosen cut $(A, \bar{A})$ satisfies $A \cap S = T$. We want to use this LP with the number of
rounds $r$ being $\max_{a \in X} 2|V_a|$, but solving this LP would require time $n^{O(k \log n)}$, which is undesirable.
Hence, we write an LP that uses only some of the variables from (2.6). Let $\mathcal{S}_a$ denote the power set of
$V_a$. Let $\mathcal{S}_{ab}$ be the power set of $V_a \cup V_b$, and let $\mathcal{S} := \bigcup_{a,b \in X} \mathcal{S}_{ab}$. For every set $S \in \mathcal{S}$, and every subset
$T \subseteq S$, we retain the variable $x(S, T)$ in the LP, and drop all the others. There are at most $\mathrm{poly}(n)$
nodes in $X$, and hence $\mathrm{poly}(n)$ sets $\mathcal{S}_{ab}$, each of which has at most $2^{2k\lambda} = n^{O(k)}$ many sets. This results
in an LP with $n^{O(k)}$ variables and a similar number of constraints.

Finally, as mentioned above, to take care of the non-linear objective function in (2.6), we guess the
optimal value $\alpha^* > 0$ of the denominator, and add the constraint

$$\sum_{(u,v) \in E_D} \mathrm{dem}_{uv}\, y_{uv} \ge \alpha^*$$

as an additional constraint to the LP, thereby just minimizing $\sum_{(u,v) \in E_G} \mathrm{cap}_{uv}\, y_{uv}$. For the rest of the
discussion, let $(x, y)$ be an optimal solution to the resulting LP.

3.2 The Rounding Algorithm 

The rounding algorithm is a very natural top-down propagation rounding procedure. We start with the
root $r \in X$; note that $V_r = U_r$ in this case. Since $\sum_{T \subseteq V_r} x(V_r, T) = 1$ by the constraints (2.2) of the
LP, the $x$ variables define a probability distribution over subsets of $V_r$. We sample a subset $A_r$ from
this distribution.

In general, for any node $a \in X$ with parent $b$, suppose we have already sampled a subset for each of
its ancestor nodes $b, \ldots, r$, and the union of these sampled sets is $A_b \subseteq V_b$. Now, let
$B_a = \{A' \subseteq V_a \mid A' \cap V_b = A_b\}$; i.e., the family of subsets of $V_a$ whose intersection with $V_b$ is precisely $A_b$. By Lemma 2.1,
we have

$$x(V_b, A_b) = \sum_{A' \in B_a} x(V_a, A').$$

Thus the values $x(V_a, A')/x(V_b, A_b)$ define a probability distribution over $B_a$. We now sample a set $A_a$
from this distribution. Note that this rounding only uses sets we retained in our pared-down LP, so we
can indeed implement this rounding. Moreover, this set $A_a \supseteq A_b$. Finally, we take the union of all the
sets

$$A := \bigcup_{a \in X} A_a,$$

and output the cut $(A, \bar{A})$. The following lemma is immediate:

Lemma 3.2 For any $a \in X$ and any $S \in \mathcal{S}_a$, we get $\Pr[(A \cap S) = T] = x(S, T)$ for all $T \subseteq S$.

Proof. First, we claim that $\Pr[A_a = T] = x(V_a, T)$ for all $a \in X$. This is a simple induction on the
depth of $a$: the base case is directly from the algorithm. For $a \in X$ with parent node $b$,

$$\Pr[A_a = T] = \Pr[A_b = T \cap V_b] \cdot \Pr[A_a = T \mid A_b = T \cap V_b] = x(V_b, T \cap V_b) \cdot \frac{x(V_a, T)}{x(V_b, T \cap V_b)} = x(V_a, T),$$

as claimed. Now we prove the statement of the lemma: Since $S \subseteq V_a$, we know that $\Pr[A \cap S = T] = \Pr[A_a \cap S = T]$,
because none of the future steps can add any other vertices from $V_a$ to $A$. Moreover,

$$\Pr[A_a \cap S = T] = \sum_{T' \subseteq V_a \setminus S} \Pr[A_a = T \cup T'] = \sum_{T' \subseteq V_a \setminus S} x(V_a, T \cup T'),$$

the last equality using the claim above. Defining $S' := V_a \setminus S$, this equals $\sum_{T' \subseteq S'} x(S \cup S', T \cup T')$,
which by Lemma 2.1 equals $x(S, T)$ as desired. ■
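
The propagation rounding of Section 3.2 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the LP values are supplied by a hypothetical oracle `x(S, T)` (here derived from an explicit distribution over cuts rather than from an LP solver), the tree is given as a `children` dictionary, and the function names are invented for the example. Weights are converted to floats for sampling.

```python
import random
from fractions import Fraction
from itertools import combinations

def subsets(S):
    """All subsets of S, as frozensets."""
    S = sorted(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

def make_x(dist):
    """Oracle x(S, T) = Pr[A ∩ S = T] for dist: frozenset(A) -> probability."""
    def x(S, T):
        S, T = frozenset(S), frozenset(T)
        return sum((p for A, p in dist.items() if A & S == T), Fraction(0))
    return x

def propagation_round(root, children, bags, x, rng):
    """Sample A_a top-down with probability x(V_a, A') / x(V_b, A_b)."""
    path_union = {}   # path_union[a] = V_a, union of bags on the root-to-a path
    sampled = {}      # sampled[a] = A_a, the chosen subset of V_a
    stack = [(root, None)]
    while stack:
        a, b = stack.pop()
        V_b = path_union[b] if b is not None else frozenset()
        A_b = sampled[b] if b is not None else frozenset()
        V_a = V_b | frozenset(bags[a])
        # B_a: the subsets of V_a that agree with A_b on V_b.
        candidates = [A_b | U for U in subsets(V_a - V_b)]
        # Unnormalised weights; random.choices normalises by their sum x(V_b, A_b).
        weights = [float(x(V_a, C)) for C in candidates]
        sampled[a] = rng.choices(candidates, weights=weights)[0]
        path_union[a] = V_a
        stack.extend((c, a) for c in children.get(a, []))
    return frozenset().union(*sampled.values())

# Toy instance: the path 0 - 1 - 2 with root bag {1} and two child bags.
bags = {"r": {1}, "a": {0, 1}, "b": {1, 2}}
children = {"r": ["a", "b"]}
dist = {frozenset(): Fraction(1, 4),
        frozenset({0}): Fraction(1, 4),
        frozenset({0, 1}): Fraction(1, 2)}
cut = propagation_round("r", children, bags, make_x(dist), random.Random(0))
```

Note that vertices 0 and 2 sit in different subtrees, so the sketch labels them independently given the label of vertex 1; this is the source of the factor-2 loss analysed in Lemma 3.4. When the oracle is a point mass on a single cut, the procedure returns that cut with certainty.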

Lemma 3.3 The probability of an edge $(u, v) \in E_G$ being cut by $(A, \bar{A})$ equals $y_{uv}$.

Proof. By the properties of tree-decompositions, each edge $(u, v) \in E_G$ lies within $U_a$ for some $a \in X$,
and $\{u, v\} \in \mathcal{S}_a$. The probability of the edge being cut is

$$\Pr[A \cap \{u, v\} = \{u\}] + \Pr[A \cap \{u, v\} = \{v\}] = x(\{u, v\}, \{u\}) + x(\{u, v\}, \{v\}) = y_{uv}.$$

The first equality above follows from Lemma 3.2, and the second from the definition of $y_{uv}$ in (2.1). ■

Thus the expected number of edges in the cut $(A, \bar{A})$ equals the numerator of the objective function.

Lemma 3.4 The probability of a demand pair $(s, t) \in E_D$ being cut by $(A, \bar{A})$ is at least $y_{st}/2$.

Proof. Let $a, b \in X$ denote the (least depth) nodes in $T$ such that $s \in U_a$ and $t \in U_b$ respectively;
for simplicity, assume that the least common ancestor of $a$ and $b$ is $r$. (An identical argument works
when the least common ancestor is not the root.) We can assume that $r \notin \{a, b\}$, or else we can use
Lemma 3.2 to claim that the probability $s, t$ are separated is exactly $y_{st}$.

Consider the set $V_a \cup V_b$, and consider the set-valued random variable $W$ (taking on values from the
power set of $V_a \cup V_b$) defined by $\Pr[W = T] := x(V_a \cup V_b, T)$. Denote the distribution by $\mathcal{D}_{ab}$, and note
that this is just the distribution specified by the Sherali-Adams LP restricted to $V_a \cup V_b$. Let $X_s$ and $X_t$
denote the indicator random variables of the events $\{s \in W\}$ and $\{t \in W\}$ respectively; these variables
are dependent in general. For a set $T \subseteq V_r$, let $X_{s|T}$ and $X_{t|T}$ be indicators for the corresponding events
conditioned on $W \cap V_r = T$. Then by definition,

$$y_{st} = \Pr[X_s \ne X_t] = \mathbb{E}_T \Pr[X_{s|T} \ne X_{t|T}] \tag{3.7}$$

where the expectation is taken over outcomes of $T = W \cap V_r$.

Let $\mathcal{D}$ denote the distribution on cuts defined by the algorithm. Let $Y_s$ and $Y_t$ denote the indicators of the events
$\{s \in A\}$ and $\{t \in A\}$ respectively, and let $Y_{s|T}$ and $Y_{t|T}$ denote these conditioned on $A \cap V_r = T$.
Thus the probability that $s$ and $t$ are separated by the algorithm is

$$\mathrm{alg}(s, t) = \Pr[Y_s \ne Y_t] = \mathbb{E}_T \Pr[Y_{s|T} \ne Y_{t|T}] \tag{3.8}$$

where the expectation is taken over the distribution of $T = A \cap V_r$; by Lemma 3.2 this distribution is
the same as that for $W \cap V_r$.

It thus suffices to prove that for any $T$,

$$\Pr[X_{s|T} \ne X_{t|T}] \le 2 \Pr[Y_{s|T} \ne Y_{t|T}]. \tag{3.9}$$

Now observe that $Y_{s|T}$ is distributed identically to $X_{s|T}$ (with both being 1 with probability
$x(V_r \cup \{s\}, T \cup \{s\}) / x(V_r, T)$), and similarly for $Y_{t|T}$ and $X_{t|T}$. However, since $s$ and $t$ lie in different
subtrees, $Y_{s|T}$ and $Y_{t|T}$ are independent, whereas $X_{s|T}$ and $X_{t|T}$ are dependent in general.

We can assume that at least one of $\mathbb{E}_{\mathcal{D}_{ab}}[X_{s|T}]$, $\mathbb{E}_{\mathcal{D}_{ab}}[X_{t|T}]$ is at most $1/2$; if not, we can do the following
analysis with the complementary indicators $1 - X_{s|T}$, $1 - X_{t|T}$, since (3.9) depends only on random
variables being unequal. Moreover, suppose

$$\mathbb{E}_{\mathcal{D}_{ab}}[X_{t|T}] \le \mathbb{E}_{\mathcal{D}_{ab}}[X_{s|T}]$$

(else we can interchange $s, t$ in the following argument). Define the distribution $\mathcal{D}'$ where we draw
$X_{s|T}, X_{t|T}$ from $\mathcal{D}_{ab}$, set $Y_{s|T}$ equal to $X_{s|T}$, and draw $Y_{t|T}$ independently from $\mathcal{D}$. By construction, the
distributions of $X_{s|T}, X_{t|T}$ in $\mathcal{D}_{ab}$ and $\mathcal{D}'$ are identical, as are the distributions of $Y_{s|T}, Y_{t|T}$ in $\mathcal{D}$ and $\mathcal{D}'$.
We claim that

$$\Pr_{\mathcal{D}'}[X_{t|T} \ne Y_{t|T}] \le \Pr_{\mathcal{D}'}[X_{s|T} \ne Y_{t|T}]. \tag{3.10}$$

Indeed, if $\mathbb{E}_{\mathcal{D}'}[X_{s|T}] = a$ and $\mathbb{E}_{\mathcal{D}'}[X_{t|T}] = b$, then $\mathbb{E}_{\mathcal{D}'}[Y_{t|T}] = b$ as well, with $b \le a$ and $b \le 1/2$.
Thus, (3.10) claims that $2b(1 - b) \le a(1 - b) + b(1 - a)$ (recall here that $Y_{t|T}$ is chosen independently
of the other variables). This holds if $b(1 - 2b) \le a(1 - 2b)$, which follows from our assumptions on $a, b$
above. Finally, by the triangle inequality,

$$\Pr[X_{s|T} \ne X_{t|T}] \le \Pr[X_{s|T} \ne Y_{t|T}] + \Pr[X_{t|T} \ne Y_{t|T}]. \tag{3.11}$$

Combining (3.10) and (3.11) and observing that $X_{s|T} = Y_{s|T}$ in our construction, the claim follows. ■
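
The inequality at the heart of this proof can be checked numerically. The sketch below is an illustration, not part of the proof: for two indicator variables with means $a$ and $b$, it compares the largest disagreement probability over all couplings, $\min(a + b,\, 2 - a - b)$, with the disagreement probability $a(1-b) + b(1-a)$ under independence, over a grid of exact rational values. The function names and the grid resolution are choices made for the example.

```python
from fractions import Fraction

def max_disagreement(a, b):
    """Largest Pr[X != Y] over all couplings of Bernoulli(a) and Bernoulli(b).

    Pr[X != Y] = a + b - 2 Pr[X = Y = 1], and Pr[X = Y = 1] >= max(0, a + b - 1).
    """
    return min(a + b, 2 - a - b)

def independent_disagreement(a, b):
    """Pr[X != Y] when X ~ Bernoulli(a) and Y ~ Bernoulli(b) are independent."""
    return a * (1 - b) + b * (1 - a)

grid = [Fraction(i, 40) for i in range(41)]

# Worst ratio between a dependent pair and an independent pair with the same marginals.
worst_ratio = max(
    max_disagreement(a, b) / independent_disagreement(a, b)
    for a in grid for b in grid
    if independent_disagreement(a, b) > 0
)

# The algebraic step behind (3.10): valid whenever b <= a and b <= 1/2.
step_3_10_holds = all(
    2 * b * (1 - b) <= a * (1 - b) + b * (1 - a)
    for a in grid for b in grid
    if b <= a and 2 * b <= 1
)
```

On this grid the worst ratio is attained at $a = b = 1/2$, where a perfectly anti-correlated pair always disagrees while an independent pair disagrees half the time.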

By Lemmas 3.3 and 3.4, a random cut $(A, \bar{A})$ chosen by our algorithm cuts an expected capacity of
exactly $\sum_{uv \in E_G} \mathrm{cap}_{uv}\, y_{uv}$, whereas the expected demand cut is at least $\frac{1}{2} \sum_{st \in E_D} \mathrm{dem}_{st}\, y_{st}$. This shows
the existence of a cut in the distribution whose sparsity is within a factor of two of the LP value. Such a
cut can be found using the method of conditional expectations; we defer the details to the next section.
Moreover, the analysis of the integrality gap is tight: Section 6 shows that for any constant $\gamma > 0$, the
Sherali-Adams LP for Sparsest Cut has an integrality gap of at least $2 - \varepsilon(\gamma)$, even after $n^{\gamma}$ rounds.

3.3 Derandomization 

In this section, we use the method of conditional expectations to derandomize our rounding algorithm,
which allows us to efficiently find a cut $(A, \bar{A})$ with sparsity at most twice the LP value. We will think
of the set $A$ as being a $\{0, 1\}$-assignment/labeling for the nodes in $V$, where $i \in A \iff A(i) = 1$.
In the above randomized process, let $Y_{ij}$ be the indicator random variable for whether the pair $(i, j)$ is
separated. We showed that for $(i, j) \in E_G$, $\mathbb{E}[Y_{ij}] = y_{ij}$, and for all other $(i, j) \in \binom{V}{2}$, $\mathbb{E}[Y_{ij}] \ge y_{ij}/2$. Now
if we let $Z = \sum_e \mathrm{cap}_e Y_e$ be the r.v. denoting the edge capacity cut by the process and $Z' = \sum_{st} \mathrm{dem}_{st} Y_{st}$
be the r.v. denoting the demand separated, then the analysis of the previous section shows that

$$\frac{\mathbb{E}[Z]}{\mathbb{E}[Z']} \le \frac{2 \sum_e \mathrm{cap}_e\, y_e}{\alpha}.$$

(Recall that $\alpha$ was the "guessed" value of the total demand separated by the actual sparsest cut.)
Equivalently, defining $LP^* := \sum_e \mathrm{cap}_e\, y_e$, and

$$W := \frac{Z}{LP^*} - \frac{2 Z'}{\alpha},$$

we know that $\mathbb{E}[W] \le 0$.

The algorithm is the natural one: for the root $r$, enumerate over all $2^k$ assignments for the bag $V_r$, and
choose the assignment $A_r$ minimizing $\mathbb{E}[W \mid A_r]$. Since $\mathbb{E}[W] \le 0$, it must be the case that $\mathbb{E}[W \mid A_r] \le 0$
by averaging. Similarly, given the choices for nodes $X' \subseteq X$ such that $T[X']$ induces a connected tree
and $\mathbb{E}[W \mid \{A_x\}_{x \in X'}] \le 0$, choose any $a \in X$ whose parent $b \in X'$, and choose an assignment $A_a$ for
the nodes in $V_a \setminus V_b$ so that the new $\mathbb{E}[W \mid \{A_x\}_{x \in X' \cup \{a\}}] \le 0$. The final assignment $A$ will satisfy
$\mathbb{E}[W \mid \{A_a\}_{a \in X}] \le 0$, which would give us a cut with sparsity at most $2 LP^*/\alpha$, as desired.

It remains to show that we can compute $\mathbb{E}[W \mid \{A_x\}_{x \in X'}]$ for any subset $X' \subseteq X$ containing the root
$r$, such that $T[X']$ is connected. Let $V' = \bigcup_{x \in X'} U_x$ be the set of nodes already labeled. For any vertex
$v \in V$, let $b(v) \in X$ be the highest node in $T$ such that $v \in U_{b(v)}$. If $v$ is yet unlabeled, then $b(v) \notin X'$,
and hence let $\ell(v)$ be the lowest ancestor of $b(v)$ in $X'$. In other words, we have chosen an assignment
$A_{\ell(v)}$ for the bag $V_{\ell(v)}$. By the properties of our algorithm, we know that

$$\Pr[v \in A \mid A_{\ell(v)}] = \frac{x(V_{\ell(v)} \cup \{v\},\, A_{\ell(v)} \cup \{v\})}{x(V_{\ell(v)}, A_{\ell(v)})}. \tag{3.12}$$

Moreover, if $u, v$ are both unlabeled such that their highest bags $b(u), b(v)$ share a root-leaf path in $T$,
then

$$\Pr[u, v \text{ separated} \mid A_{\ell(v)}] = \frac{x(V_{\ell(v)} \cup \{u, v\},\, A_{\ell(v)} \cup \{u\}) + x(V_{\ell(v)} \cup \{u, v\},\, A_{\ell(v)} \cup \{v\})}{x(V_{\ell(v)}, A_{\ell(v)})}, \tag{3.13}$$

where $\ell(v) = \ell(u)$ is the lowest ancestor of $b(u), b(v)$ that has been labeled. If $u, v$ are yet unlabeled, but
we have chosen an assignment for $a = \mathrm{lca}(b(u), b(v))$, then $u, v$ will be labeled independently using (3.12).
Finally, if $u, v$ are unlabeled, and we have not yet chosen an assignment for $a = \mathrm{lca}(b(u), b(v))$, then the
probability of $u, v$ being cut is precisely

$$\sum_{U \subseteq V_a \setminus V_{\ell(v)}} \frac{x(V_a, A_{\ell(v)} \cup U)}{x(V_{\ell(v)}, A_{\ell(v)})} \cdot \Pr[(u, v) \text{ separated} \mid V_a \text{ labeled } A_{\ell(v)} \cup U],$$

where the probability can be computed using (3.12), since $u, v$ will be labeled independently after
conditioning on a labeling for $V_a$. There are at most $n^{O(k)}$ terms in the sum, and hence we can compute
this in the claimed time bound. Now, we can compute $\mathbb{E}[W \mid \{A_x\}_{x \in X'}]$ using the above expressions in
time $n^{O(k)}$, which completes the proof.

3.3.1 Embedding into $\ell_1$

Our algorithm and analysis also imply a 2-approximation to the minimum distortion $\ell_1$ embedding
of a treewidth $k$ graph in time $n^{O(k)}$. We will describe an algorithm that, given $D$, either finds an
embedding with distortion $2D$ or certifies that any $\ell_1$ embedding of $G$ requires distortion more than
$D$. It is easy to use such a subroutine to get a $(2 + o(1))$-approximation to the minimum distortion $\ell_1$
embedding problem.

Towards this end, we write a relaxation for the distortion $D$ embedding problem as follows. Given $G$
with treewidth $k$, we start with the $r$-round Sherali-Adams polytope $\mathrm{SA}_r(n)$ with $r = O(k \log n)$. We
add the additional set of constraints $C \cdot d(u, v) \le y_{uv} \le D \cdot C \cdot d(u, v)$, for every pair of vertices $u, v \in V$.
The cut characterization of $\ell_1$ implies that this linear program is feasible whenever there is a distortion
$D$ embedding. Given a solution to the linear program, we round it using the rounding algorithm of the
last section. It is immediate from our analysis that a random cut sampled by the algorithm satisfies
$\Pr[(u, v) \text{ separated}] \in [y_{uv}/2,\, y_{uv}]$.

Moreover, since the analysis of the rounding algorithm only uses equality constraints on the
expectations of random variables, we can use the approach of Karger and Koller [KK97] to get an
explicit sample space $\Omega$ of size $|\Omega| = n^{O(k)}$ that satisfies all these constraints. Indeed, each of the points
$\omega \in \Omega$ of this sample space gives us a $\{0, 1\}$-embedding of the vertices of the graph. We can concatenate
all these embeddings and scale down suitably in time $|\Omega| \cdot \mathrm{poly}(n)$ to get an $\ell_1$-embedding $f : V \to \mathbb{R}^{|\Omega|}$
with the properties that (a) $\|f(u) - f(v)\|_1 = y_{uv}$ for all $(u, v) \in E_G$, and (b) $\|f(u) - f(v)\|_1 \ge y_{uv}/2$
for $(u, v) \in \binom{V}{2}$. Scaling $f$ by a factor of $C$ gives an embedding with distortion $2D$.
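
The step of concatenating cut indicator vectors into an $\ell_1$ embedding can be illustrated directly. The sketch below is a toy illustration under stated assumptions: the weighted list of cuts stands in for the explicit sample space (it is not produced by the construction of [KK97]), and the function names are invented for the example.

```python
from fractions import Fraction

def cuts_to_l1(V, weighted_cuts):
    """Map each vertex to a vector with one coordinate per cut.

    weighted_cuts: list of (A, w) where A is a cut side and w its probability.
    Coordinate for (A, w) is w if the vertex lies in A and 0 otherwise.
    """
    return {v: [w if v in A else Fraction(0) for A, w in weighted_cuts] for v in V}

def l1_distance(p, q):
    """The l_1 distance between two coordinate vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))

def separation_probability(u, v, weighted_cuts):
    """Total weight of the cuts that put u and v on different sides."""
    return sum(w for A, w in weighted_cuts if (u in A) != (v in A))

# Hypothetical sample space with two equally likely cuts on V = {0, 1, 2}.
V = [0, 1, 2]
weighted_cuts = [(frozenset({0}), Fraction(1, 2)), (frozenset({0, 1}), Fraction(1, 2))]
f = cuts_to_l1(V, weighted_cuts)
```

By construction, $\|f(u) - f(v)\|_1$ equals the probability that a cut drawn from the weighted list separates $u$ and $v$, which is the property used in (a) and (b) above.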

4 The Hardness Result 

In this section, we prove the Apx-hardness claimed in Theorem 1.3. In particular, we show the following 
reduction from the MaxCut problem to the Non-Uniform Sparsest Cut problem. 

Theorem 4.1 For any $\varepsilon > 0$, a $\rho$-approximation algorithm for Non-Uniform Sparsest Cut on
series-parallel graphs (with arbitrary demand graphs) that runs in time $T(n)$ implies a $(\frac{1}{\rho} - \varepsilon)$-approximation
to MaxCut on general graphs running in time $T(n^{O(1/\varepsilon)})$.

The current best hardness-of-approximation results for MaxCut are: (a) the $(16/17 + \varepsilon)$-factor hardness
(assuming P $\ne$ NP) due to Håstad [Has01] (using the gadgets from Trevisan et al. [TSSW00]) and (b) the
$(\alpha_{GW} + \varepsilon)$-factor hardness (assuming the Unique Games Conjecture) due to Khot et al. [KKMO07,
MOO10], where $\alpha_{GW} = 0.87856\ldots$ is the constant obtained in the hyperplane rounding for the MaxCut
SDP. Combined with Theorem 4.1, these imply hardness results of $(17/16 - \varepsilon)$ and $(1.138 - \varepsilon)$ respectively
for Non-Uniform Sparsest Cut and prove Theorem 1.3.

The proof of Theorem 4.1 proceeds by taking the hard MaxCut instances and using them to construct
the demand graphs in a Sparsest Cut instance, where the supply graph is the familiar fractal obtained
from the graph $K_{2,n}$.$^3$ The base case of this recursive construction is in Section 4.1, and the full
construction is in Section 4.2. The analysis of the latter is based on a generic powering lemma, which
will be useful for showing tight Unique Games hardness for bounded treewidth graphs in Section 5 and
the Sherali-Adams integrality gap in Section 6.

4.1 The Basic Building Block 

Given a connected (unweighted) MaxCut instance $H = ([n], E_H)$, let $m = |E_H|$, and let
$\mathrm{mc}(H) := \max_{A \subseteq [n]} |\partial_H(A)|$. Let the supply graph be $G'_1 = (V_1, E_1)$, with vertices $V_1 = \{s, t\} \cup [n]$ and edges
$E_1 = \bigcup_{i \in [n]} \{\{s, i\}, \{t, i\}\}$. Define the capacities $\mathrm{cap}_{si} = \mathrm{cap}_{ti} = \deg_H(i)/2m$. Define the demands thus:
$\mathrm{dem}_{st} = 1$, and for $i, j \in [n]$, let $\mathrm{dem}_{ij} = \mathbf{1}_{\{i,j\} \in E_H}/m$ (i.e., $i, j$ have $1/m$ demand between them if
$\{i, j\}$ is an edge in $H$, and zero otherwise). Let this setting of demands be denoted $D'_1$. (The hardness
results in Chuzhoy and Khanna [CK09] and Chlamtac et al. [CKR10] used the same graph $G_1$, but with
a different choice of capacities and demands.)

$^3$ The fractal for $K_{2,2}$ has been used for lower bounds on the distortion incurred by tree embeddings [GNRS04], Euclidean
embeddings [NR03], and low-dimensional embeddings in $\ell_1$ [BC05, LN04, Reg12]. Moreover, the fractal for $K_{2,n}$ shows
the integrality gap for the natural metric relaxation for Sparsest Cut [LR10, CSW10].

Claim 4.2 The sparsest cuts in $G'_1$ are $s$-$t$-separating, and have sparsity $m/(m + \mathrm{mc}(H))$.

Proof. For $A \subseteq [n]$, the cut $(A + s, \bar{A} + t)$ has sparsity

$$\frac{\sum_{i \in [n]} \deg_H(i)/2m}{|\partial_H(A)|/m + 1} = \frac{m}{|\partial_H(A)| + m} < 1,$$

since $\frac{1}{2} \sum_i \deg_H(i) = m$. The cut $(A, \bar{A} + s + t)$ has sparsity

$$\frac{2 \sum_{i \in A} \deg_H(i)/2m}{|\partial_H(A)|/m} \ge \frac{2 \sum_{i \in A} \deg_H(i)/2}{\sum_{i \in A} \deg_H(i)} \ge 1,$$

which is strictly worse than any $s$-$t$-separating cut. Hence the sparsest cut is the cut $(A + s, \bar{A} + t)$ that
maximizes $|\partial_H(A)|$. ■
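
Claim 4.2 can be checked by brute force on small MaxCut instances. The following sketch is illustrative only: it builds the capacities and demands of $G'_1$ as defined above and enumerates all cuts, which is feasible only for very small $n$. The function names and the dictionary encoding are assumptions made for the example.

```python
from fractions import Fraction
from itertools import combinations

def base_instance(n, H_edges):
    """Capacities and demands of G'_1 for a MaxCut instance H on vertices 1..n."""
    m = len(H_edges)
    deg = {i: 0 for i in range(1, n + 1)}
    for i, j in H_edges:
        deg[i] += 1
        deg[j] += 1
    cap = {}
    for i in range(1, n + 1):
        cap[frozenset(("s", i))] = Fraction(deg[i], 2 * m)
        cap[frozenset(("t", i))] = Fraction(deg[i], 2 * m)
    dem = {frozenset(("s", "t")): Fraction(1)}
    for i, j in H_edges:
        dem[frozenset((i, j))] = Fraction(1, m)
    return ["s", "t"] + list(range(1, n + 1)), cap, dem

def crossing(weights, S):
    """Total weight of the pairs with exactly one endpoint in S."""
    return sum((w for e, w in weights.items() if len(e & S) == 1), Fraction(0))

def sparsest_cut_value(V, cap, dem):
    """Minimum sparsity over all cuts that separate some demand."""
    best = None
    for r in range(1, len(V)):
        for S in map(frozenset, combinations(V, r)):
            d = crossing(dem, S)
            if d > 0:
                val = crossing(cap, S) / d
                best = val if best is None else min(best, val)
    return best

def max_cut(n, H_edges):
    """mc(H) by enumeration over all vertex subsets."""
    return max(
        sum(1 for i, j in H_edges if (i in A) != (j in A))
        for r in range(n + 1)
        for A in map(set, combinations(range(1, n + 1), r))
    )

triangle = [(1, 2), (2, 3), (1, 3)]
V, cap, dem = base_instance(3, triangle)
value = sparsest_cut_value(V, cap, dem)
```

For the triangle, $m = 3$ and $\mathrm{mc}(H) = 2$, so the claim predicts sparsity $3/5$.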

Given a $cm$-vs-$sm$ hardness result for MaxCut, this gives us a $(1 + c)$-vs-$(1 + s)$ hardness for Sparsest
Cut. However, we can do better using a recursive "fractal" construction, as we show next. Before
we proceed further, we remark that if we remove the $s$-$t$ demand from the instance $G'_1$, we obtain an
instance $G_1$ with the following properties.

Lemma 4.3 The instance $G_1$ constructed by removing $\mathrm{dem}_{st}$ from $G'_1$ satisfies:

• If $H$ has a cut of size $cm$, then there is an $s$-$t$ separating cut of capacity 1 that separates $c$
demand.

• Any $s$-$t$ separating cut has capacity at least 1.

• If the maximum cut in $H$ has size $sm$, then every $s$-$t$ separating cut has sparsity at least $s^{-1}$.

• Any cut that does not separate $s$ and $t$ has sparsity at least 1.

While $G_1$ by itself is not a hard instance of Sparsest Cut, the above properties will make it a useful
building block in the powering operation below.



Figure 4.1: Base case of the construction $G_1$ for $n = 5$.

4.2 An Instance Powering Operation

In this section, we describe a powering operation on Sparsest Cut instances that we use to boost the
hardness result. This is the natural fractal construction. We start with an instance $G_1 = (V_1 =
\{s,t\} \cup [n], \mathrm{cap}_e, \mathrm{dem}_e)$ of the sparsest cut problem. In other words, we have a Sparsest Cut instance
with two designated vertices $s$ and $t$. (For concreteness, think of the $G_1$ from the previous section, but
any graph $G_1$ would do.)

For $\ell \ge 2$, consider the graph $G_\ell$ obtained by taking $G_1$ and replacing each capacity edge $e = (u, v)$
in $G_1$ with a copy of $G_{\ell-1}$ in the natural way. In other words, for every $e = (u, v)$, we create a copy
$G^e_{\ell-1}$ of $G_{\ell-1}$, and identify its vertex $s$ with $u$ and its $t$ with $v$. Moreover, $G^e_{\ell-1}$ is scaled down by $\mathrm{cap}_e$.
Thus if edge $f \in G_{\ell-1}$ has capacity $\mathrm{cap}_f$ in $G_{\ell-1}$, then the corresponding edge in $G^e_{\ell-1}$ has capacity
$\mathrm{cap}_e \cdot \mathrm{cap}_f$; the demands in $G^e_{\ell-1}$ are also scaled by the same factor. In addition to the scaled demands
from copies of $G_{\ell-1}$, $G_\ell$ contains new level-$\ell$ demands $\mathrm{dem}_{ij}$ from the base graph $G_1$. Note that this
instance contains vertices of $V_1$ in its vertex set and will have $s$ and $t$ as its designated vertices.

The following properties are immediate. 

Observation 4.4 If $G_1$ has $n$ vertices and $m$ capacity edges, then $G_\ell$ has at most $m^{\ell-1} n$ vertices and $m^\ell$
capacity edges. Moreover, if the supply graph in $G_1$ has treewidth $k$, then the supply graph of $G_\ell$ also
has treewidth $k$.
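The powering operation can be sketched as a short recursion. The following is a minimal illustration, assuming an instance is given as lists of `(u, v, weight)` triples with terminals named `"s"` and `"t"`; the names `power` and `vertices` and the tagging scheme for inner vertices are hypothetical choices, not notation from the text.

```python
from fractions import Fraction

def power(base_cap, base_dem, level):
    """Edge lists (u, v, weight) of the capacities and demands of G_level."""
    if level == 1:
        return list(base_cap), list(base_dem)
    sub_cap, sub_dem = power(base_cap, base_dem, level - 1)
    cap, dem = [], list(base_dem)            # start with the new top-level demands
    for idx, (u, v, c_e) in enumerate(base_cap):
        def rename(x):
            # Identify the copy's s with u and its t with v; tag inner vertices by idx.
            return u if x == "s" else v if x == "t" else (idx, x)
        # Capacities and demands of the copy are both scaled by cap_e.
        cap += [(rename(a), rename(b), c_e * w) for a, b, w in sub_cap]
        dem += [(rename(a), rename(b), c_e * w) for a, b, w in sub_dem]
    return cap, dem

def vertices(cap):
    """Set of endpoints of the capacity edges."""
    return {x for a, b, _ in cap for x in (a, b)}
```

Counting `len(cap)` and `len(vertices(cap))` for small levels gives a quick sanity check of the edge and vertex counts in Observation 4.4.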

We next argue "completeness" and "soundness" properties of this operation. We will distinguish between 
cuts that separate s and t, and those that do not. We call the former cuts admissible and the latter 
inadmissible. 

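Lemma 4.5 If $G_1$ has an admissible cut $(A, \bar{A})$ that cuts $\mathrm{cap}(A, \bar{A})$ capacity and $\mathrm{dem}(A, \bar{A})$ demand,
then there exists an admissible cut in $G_\ell$ of capacity $(\mathrm{cap}(A, \bar{A}))^\ell$ that cuts $\mathrm{dem}(A, \bar{A}) \cdot \bigl(\sum_{i=0}^{\ell-1} \mathrm{cap}(A, \bar{A})^i\bigr)$
demand.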

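Proof. The proof is by induction on $\ell$. The base case $\ell = 1$ is an assumption of the lemma. Assume
the claim holds for $G_{\ell-1}$. Let $(A_{\ell-1}, \bar{A}_{\ell-1})$ denote the admissible cut satisfying the induction hypothesis
and let $s \in A_{\ell-1}$. Recall that $G_\ell$ is created by replacing the edges of $G_1$ by copies of $G_{\ell-1}$. Define the
cut $A_\ell$ in the natural way: Start with $A_\ell = A$. Then for each $e = (u, v) \in G_1$ such that $u, v \in A$, we
place all of $G^e_{\ell-1}$ in $A_\ell$; similarly, if $u, v \in \bar{A}$ then place all of $G^e_{\ell-1}$ in $\bar{A}_\ell$. For $(u,v) \in G_1$ such that
$u \in A$, $v \in \bar{A}$, we cut $G^e_{\ell-1}$ according to $(A_{\ell-1}, \bar{A}_{\ell-1})$: i.e., the copy of a vertex $x \in A_{\ell-1}$ is placed in
$A_\ell$. Similarly, if $u \in \bar{A}$, $v \in A$, we put the copy of $x$ in $A_\ell$ if $x \in \bar{A}_{\ell-1}$. This defines the cut $(A_\ell, \bar{A}_\ell)$.

The capacity of the cut can be computed as follows: For each edge of $G_1$ cut by $A$, the corresponding
copy of $G_{\ell-1}$ contributes $\mathrm{cap}_e \cdot \mathrm{cap}(A_{\ell-1}) = \mathrm{cap}_e \cdot \mathrm{cap}(A, \bar{A})^{\ell-1}$ to the cut, where we used the inductive
hypothesis for $A_{\ell-1}$. For edges not cut by $(A, \bar{A})$, the corresponding $G^e_{\ell-1}$ is uncut and contributes 0.
Thus

$$\mathrm{cap}(A_\ell, \bar{A}_\ell) = \sum_{e \in (A, \bar{A})} \mathrm{cap}_e \cdot \mathrm{cap}(A, \bar{A})^{\ell-1} = \mathrm{cap}(A, \bar{A})^\ell.$$

Similarly, the demand from copies of $G_{\ell-1}$ cut by $A_\ell$ is exactly

$$\sum_{e \in (A, \bar{A})} \mathrm{cap}_e \cdot \mathrm{dem}(A_{\ell-1}, \bar{A}_{\ell-1})
= \mathrm{cap}(A, \bar{A}) \cdot \mathrm{dem}(A_{\ell-1}, \bar{A}_{\ell-1})
= \mathrm{cap}(A, \bar{A}) \cdot \mathrm{dem}(A, \bar{A}) \cdot \Bigl(\sum_{i=0}^{\ell-2} \mathrm{cap}(A, \bar{A})^i\Bigr)
= \mathrm{dem}(A, \bar{A}) \cdot \Bigl(\sum_{i=1}^{\ell-1} \mathrm{cap}(A, \bar{A})^i\Bigr).$$

(The second equality is from the induction hypothesis.) Additionally, $A_\ell$ cuts exactly $\mathrm{dem}(A, \bar{A})$ units
of the level-$\ell$ demands. The claim follows by summing the two. ■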

Note that if $G_1$ has an admissible cut $(A, \bar{A})$ of capacity 1 that cuts $\mathrm{dem}(A, \bar{A})$ units of demand, then
the above lemma gives us a cut of capacity 1 that cuts $\ell \cdot \mathrm{dem}(A, \bar{A})$ units of demand.
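The induction in Lemma 4.5 amounts to a two-term recursion on the capacity and demand of the lifted cut. The sketch below (function names are hypothetical) iterates that recursion and compares it against the closed form in the lemma statement.

```python
from fractions import Fraction

def lifted_cut(cap, dem, level):
    """Capacity and demand of the lifted cut A_level, following the inductive step."""
    c, d = cap, dem
    for _ in range(level - 1):
        # Cut copies contribute cap * (previous values); the new top level adds dem.
        c, d = cap * c, cap * d + dem
    return c, d

def closed_form(cap, dem, level):
    """Closed form from the lemma: cap^level and dem * sum_{i < level} cap^i."""
    return cap ** level, dem * sum(cap ** i for i in range(level))
```

With `cap = 1` both functions return capacity 1 and demand `level * dem`, which is the special case noted after the proof.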

Now, for soundness analysis, we argue that if $G_1$ has no "good" cuts, then neither does $G_\ell$. It will be
convenient to separately argue about the admissible and inadmissible cuts.

We will need the notion of "connected" cuts. Given a graph $G = (V,E)$, call a cut $(X, V \setminus X)$ connected
if the resulting components $G[X]$ and $G[V \setminus X]$ are both connected graphs. Observe that for a connected
admissible cut $(A + s, \bar{A} + t)$ in $G_\ell$, along any $s$-$t$ shortest path $P$, the vertices in $P \cap (A + s)$ form
some prefix of $P$; that is, this path is cut exactly once.

Lemma 4.6 ([OS81], Lemma 2.1(ii)) For any connected Sparsest Cut instance $(G,D)$, there exists
a sparsest cut that is connected.

Proof. Let $S$ be the sparsest cut in $G$ such that $\sum_{e \in \partial_G(S)} \mathrm{cap}_e$ is as small as possible. We claim that
$S$ must be connected. Suppose not, and let $S_1, \ldots, S_k$ be the connected components of $G[S]$ and $G[V \setminus S]$,
so that $k \ge 3$. Let $\mathrm{cap}_e$ be the capacity on edge $e \in E_G$ and let $\mathrm{dem}_e$ be the demand on edge $e \in E_D$. Let
$\Phi_{G,D}$ be the value of the Sparsest Cut instance. Then

$$\Phi_{G,D} = \frac{\sum_{e \in \partial_G(S)} \mathrm{cap}_e}{\sum_{e \in \partial_D(S)} \mathrm{dem}_e} \ge \frac{\sum_{i=1}^{k} \sum_{e \in \partial_G(S_i)} \mathrm{cap}_e}{\sum_{i=1}^{k} \sum_{e \in \partial_D(S_i)} \mathrm{dem}_e}.$$

Since $S$ is a sparsest cut, each $S_i$ has sparsity at least $\Phi_{G,D}$. It follows that in fact all $S_i$'s have the
same sparsity. Thus for each $i$,

$$\Phi_{G,D} = \frac{\sum_{e \in \partial_G(S_i)} \mathrm{cap}_e}{\sum_{e \in \partial_D(S_i)} \mathrm{dem}_e}.$$

Since $G$ is connected, each of the quantities $\sum_{e \in \partial_G(S_i)} \mathrm{cap}_e$ is positive. Moreover, since $k \ge 3$, there
must be an $i$ such that $\sum_{e \in \partial_G(S_i)} \mathrm{cap}_e$ is strictly smaller than $\sum_{e \in \partial_G(S)} \mathrm{cap}_e$. This, however, contradicts
the definition of $S$, and the claim follows. ■

We now proceed to the main technical result of this section.

Lemma 4.7 Suppose that for some constant $\gamma$, $G_1$ satisfies:

• Any admissible cut $(A, \bar{A})$ has capacity $\mathrm{cap}(A, \bar{A})$ at least 1.

• Any admissible cut $(A, \bar{A})$ cuts at most $\gamma \cdot \mathrm{cap}(A, \bar{A})$ demand.

• Any inadmissible cut $(A, \bar{A})$ cuts at most $\mathrm{cap}(A, \bar{A})$ demand.

Then $(G_\ell, D_\ell)$ satisfies:

• Any admissible cut $(A, \bar{A})$ has capacity $\mathrm{cap}(A, \bar{A})$ at least 1.

• Any admissible cut $(A, \bar{A})$ cuts at most $\ell\gamma \cdot \mathrm{cap}(A, \bar{A})$ demand.

• Any inadmissible cut $(A, \bar{A})$ cuts at most $((\ell - 1)\gamma + 1) \cdot \mathrm{cap}(A, \bar{A})$ demand.

Proof. The proof is by induction on $\ell$. The base case $\ell = 1$ is the assumption of the lemma. Suppose
that the claim holds for $G_{\ell-1}$.

First, let $(A_\ell, \bar{A}_\ell)$ be an admissible cut. Let $(A_1, \bar{A}_1)$ denote the projection of this cut onto $\{s,t\} \cup [n]$,
i.e., $A_1 = A_\ell \cap ([n] \cup \{s, t\})$. For each edge $e \in (A_1, \bar{A}_1)$, the cut $A_\ell$ induces an admissible cut on
$G^e_{\ell-1}$. This contributes at least unit capacity to the corresponding level-$(\ell - 1)$ cut (by the induction
hypothesis), and thus $\mathrm{cap}_e \cdot 1$ to the cut $(A_\ell, \bar{A}_\ell)$ because of the scaling-down in the construction of $G_\ell$.
Summing over all edges $e \in (A_1, \bar{A}_1)$, we conclude that $\mathrm{cap}(A_\ell, \bar{A}_\ell)$ is at least the capacity of $(A_1, \bar{A}_1)$
in $G_1$. Using the fact that all admissible cuts in $G_1$ have capacity at least 1, the first part of the claim
follows.

Next, we estimate the demand cut by $(A_\ell, \bar{A}_\ell)$. The total level-$\ell$ demand cut is at most $\gamma$ times
the capacity of $(A_1, \bar{A}_1)$ in $G_1$, and hence by the argument above, contributes at most $\gamma\, \mathrm{cap}(A_\ell, \bar{A}_\ell)$.
Moreover, if $\mathrm{dem}^e_{\ell-1}$ and $\mathrm{cap}^e_{\ell-1}$ denote the total demand and capacity cut by $A_\ell$ inside $G^e_{\ell-1}$, then by
the induction hypothesis, we have $\mathrm{dem}^e_{\ell-1} \le (\ell - 1) \cdot \gamma \cdot \mathrm{cap}^e_{\ell-1}$. Since these demands and capacities
are coming from disjoint sets of edges, we can add these inequalities to conclude that $\mathrm{dem}(A_\ell, \bar{A}_\ell)$ is at
most $\gamma\, \mathrm{cap}(A_\ell, \bar{A}_\ell) + \sum_e \mathrm{dem}^e_{\ell-1} \le (\gamma + (\ell - 1) \cdot \gamma) \cdot \mathrm{cap}(A_\ell, \bar{A}_\ell)$. The second part of the claim follows.

Finally, let $(A_\ell, \bar{A}_\ell)$ be an inadmissible cut with $s, t \notin A_\ell$; by Lemma 4.6, we can assume it is connected.
Let $(A_1, \bar{A}_1)$ denote the projection of $(A_\ell, \bar{A}_\ell)$ onto $G_1$. Our construction guarantees that $A_1$ is either
empty, or induces a connected cut in $G_1$. In the former case, $A_\ell$ induces an inadmissible cut in some
$G^e_{\ell-1}$; the demand and capacity cut by $A_\ell$ equal those in this inadmissible cut of $G^e_{\ell-1}$, so we can use
the inductive hypothesis. Since both the demands and capacities are scaled by the same amount, this
gives us the proof for the first case. In the latter case, for every edge $e \in G_1$ cut by $(A_1, \bar{A}_1)$, the cut
$(A_\ell, \bar{A}_\ell)$ induces an admissible cut in $G^e_{\ell-1}$. Let $\mathrm{cap}^e_{\ell-1}$ and $\mathrm{dem}^e_{\ell-1}$ denote the capacity and demand
cut by $A_\ell$ inside $G^e_{\ell-1}$. By the inductive hypothesis, each of these admissible cuts has at least unit
capacity in the unscaled version of $G_{\ell-1}$, so the scaled-down capacity satisfies $\mathrm{cap}^e_{\ell-1} \ge \mathrm{cap}_e$. Summing over
all cut edges, $\mathrm{cap}(A_\ell, \bar{A}_\ell) \ge \mathrm{cap}_{G_1}(A_1, \bar{A}_1)$. The level-$\ell$ demand cut by $A_\ell$ is equal to the demand cut
in $G_1$ by $A_1$; by the last assumption this is at most $\mathrm{cap}_{G_1}(A_1, \bar{A}_1) \le \mathrm{cap}(A_\ell, \bar{A}_\ell)$. Moreover, using the
second part of the induction hypothesis, we conclude that for any $e$, $\mathrm{dem}^e_{\ell-1} \le (\ell - 1)\gamma\, \mathrm{cap}^e_{\ell-1}$. Since
the demands and capacities contributing to $\mathrm{dem}^e_{\ell-1}$ and $\mathrm{cap}^e_{\ell-1}$ are disjoint for different edges $e$, we can
add these inequalities to conclude that the total demand cut is at most $\mathrm{cap}(A_\ell, \bar{A}_\ell) + \sum_e \mathrm{dem}^e_{\ell-1} \le
\mathrm{cap}(A_\ell, \bar{A}_\ell)(1 + (\ell - 1)\gamma)$, which completes the proof. ■

4.2.1 Putting It Together 

Let $G_1$ be the instance defined in Section 4.1 and $G_\ell$ be obtained by the powering operation starting
with $G_1$. Lemmas 4.3 and 4.5 imply that if $H$ has a cut of size $cm$, then $G_\ell$ has a cut of sparsity
$\frac{1}{\ell c}$. Moreover, using Lemma 4.7 along with Lemma 4.3 shows that if $H$ has max cut size at most $sm$,
then the sparsest cut in $G_\ell$ has sparsity at least $\frac{1}{1 + (\ell - 1)s}$. Hence, a $cm$-vs-$sm$ hardness for MaxCut
translates to a $\frac{\ell c}{1 + (\ell - 1)s} \ge \frac{c}{s}\bigl(1 - \frac{1}{\ell s}\bigr)$ hardness for Sparsest Cut. Taking $\ell = \frac{1}{s\varepsilon} = \Omega(1/\varepsilon)$ gives us a
hardness of $\frac{c}{s}(1 - \varepsilon)$.
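The calculation behind this bound can be spelled out as follows. This is a sketch reconstructed from Lemmas 4.3, 4.5, and 4.7 (applied with $\gamma = s$), not a quotation of the source: in the "yes" case $G_\ell$ has a cut of sparsity at most $1/(\ell c)$, and in the "no" case every cut has sparsity at least $1/(1 + (\ell - 1)s)$ because $s \le 1$, so the ratio of the two is

```latex
\[
\frac{1/(1+(\ell-1)s)}{1/(\ell c)}
  = \frac{\ell c}{1 + (\ell - 1)s}
  \;\ge\; \frac{\ell c}{1 + \ell s}
  = \frac{c}{s}\Bigl(1 - \frac{1}{1 + \ell s}\Bigr)
  \;\ge\; \frac{c}{s}\Bigl(1 - \frac{1}{\ell s}\Bigr).
\]
```

For any $\ell \ge 1/(s\varepsilon)$ the last factor is at least $1 - \varepsilon$.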

Note that Lee and Raghavendra [LR10] show the integrality gap of the natural LP relaxation for Non-Uniform
Sparsest Cut is 2 for series-parallel graphs; Chekuri, Shepherd, and Weibel [CSW10] give a
different analysis of the integrality gap lower bound. Their instances are the graphs $G_\ell$ above, but with
$K_n$ as the MaxCut instance $H$. In hindsight, their gaps follow from the fact that the integrality gap
of the LP relaxation of MaxCut on $K_n$ is 2. This theme will be revisited when we show an integrality
gap for the Sherali-Adams LP using the Sherali-Adams integrality gaps for MaxCut.

5 A Tight Unique Games Hardness 

In this section, we show that, assuming the Unique Games Conjecture, the Sparsest Cut problem is
hard to approximate better than a factor of 2, even on bounded treewidth graphs. Specifically, for
every constant $\varepsilon > 0$, having a polynomial-time algorithm for every fixed value of treewidth that gave
a $(2 - \varepsilon)$-approximation to Sparsest Cut would violate the Unique Games Conjecture.

The proof in this section first abstracts out a useful form of the Unique Games problem and builds a
basic instance from it that shows a hardness of $3/2 - \varepsilon$. Then we use the powering ("fractalization")
operation from Section 4.2 to boost the hardness to $2 - \varepsilon$.

5.1 A Convenient Form of Unique Games 

One standard form of the Unique Label Cover (a.k.a. Unique Games) problem is the following. We are
given a bipartite graph $B = (U, V, E_B)$. There is a label set with $d$ labels. Each edge $(u,v) \in E_B$ has
an associated bijective map $\sigma_{u,v} : [d] \to [d]$. A labeling is a map from $U \cup V$ to $[d]$, and satisfies an edge
$(u,v) \in E_B$ if

$$\sigma_{u,v}(\mathrm{label}(u)) = \mathrm{label}(v).$$

The optimum of the Unique Label Cover problem is the maximum fraction of edges satisfied by any 
labeling. 
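As a concrete reading of this definition, the following sketch evaluates a labeling and brute-forces the optimum of a tiny instance. The encoding is an assumption made for illustration: labels are $0, \ldots, d-1$ rather than $1, \ldots, d$, each bijection is stored as a tuple `p` with `p[a]` the image of `a`, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def satisfied_fraction(edges, sigma, label):
    """Fraction of edges (u, v) with sigma[(u, v)][label[u]] == label[v]."""
    good = sum(1 for u, v in edges if sigma[(u, v)][label[u]] == label[v])
    return Fraction(good, len(edges))

def optimum(left, right, edges, sigma, d):
    """Brute-force optimum over all labelings; only feasible for tiny instances."""
    nodes = list(left) + list(right)
    return max(satisfied_fraction(edges, sigma, dict(zip(nodes, labels)))
               for labels in product(range(d), repeat=len(nodes)))
```

On a 4-cycle with $d = 2$, three identity constraints and one swap constraint cannot all be satisfied at once, so `optimum` returns a value strictly below 1 there.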

Conjecture 5.1 (Unique Games Conjecture) For any $\eta, \gamma > 0$, there is a large enough constant
$d = d(\eta, \gamma)$ such that it is NP-hard to distinguish whether a Unique Label Cover instance with label size
$d$ has optimum at least $1 - \eta$ or at most $\gamma$.

It will be most convenient for us to consider non-bipartite versions of Unique Label Cover, where there
is a general (multi)-graph $H = (V_H, E_H)$. For each $e = (v,w) \in E_H$ there is again a bijective map
$\sigma_e : [d] \to [d]$ and the goal is to find a labeling maximizing the number of satisfied edges. We call
such a multigraph $H$ a union of cliques if there exists a partition of $E_H$ into (edge-disjoint) cliques
$C_1, C_2, \ldots, C_r$ for some $r$, i.e., each $C_i$ is a complete graph on some subset $S_i \subseteq V_H$. (Recall that $H$ is a
multigraph, so these sets may have more than single vertices in common, resulting in parallel edges.)
We call a Unique Label Cover instance $(H, \{\sigma_e\}_{e \in E_H}, d)$ $\Delta$-nice if

• The edge set $E_H$ is an (edge-disjoint) union of cliques $C_1, C_2, \ldots, C_n$ where $n := |V_H|$.

• Each clique $C_i$ is over some subset $S_i$ of size $\Delta$.

• Each vertex $v \in V_H$ lies in exactly $\Delta$ cliques.

Note these properties mean that the degree of each vertex is exactly $\Delta(\Delta - 1)$, where we count parallel
edges. Moreover, the total number of edges in $H$ is $n\binom{\Delta}{2}$. A Unique Label Cover instance is nice if it is
$\Delta$-nice for some $\Delta$. (The use of such a Unique Label Cover instance is also implicit, e.g., in [KKMO07].)

Lemma 5.2 Assuming the UGC, for any $\eta > 0$, there is a large enough constant $d = d(\eta)$ such that it
is NP-hard to distinguish whether a nice Unique Label Cover instance with label size $d$ has optimum at
least $1 - \eta$, or at most $\eta$. Moreover, this holds for $\Delta$-nice instances where $\eta < 1/2\Delta$.

Proof. We can assume we have instances $B = (U, V, E_B)$ of bipartite Unique Label Cover which are
$\Delta$-regular (all vertices in $U \cup V$ have degree $\Delta$ in $B$), where we cannot distinguish between $(1 - \delta)$-versus-$\delta$
fraction of satisfiable constraints. We define a non-bipartite Unique Label Cover instance $H = (V, E_H)$
on the set $V$ as follows: For each $u \in U$, add edges in $E_H$ between all its neighbors. (Hence, the
number of edges added between $v, w \in V$ equals the number of common neighbors they have.) Since
each vertex $u \in U$ results in a clique being added among its $\Delta$ neighbors, the new instance is a union
of $n$ edge-disjoint $\Delta$-cliques. Moreover, each vertex in $V$ belongs to $\Delta$ cliques, equal to its degree in $B$.
So the instance is $\Delta$-nice. Finally, the bijection constraints are $\sigma'_{vw} := \sigma_{uw} \circ \sigma_{uv}^{-1}$, with label set $[d]$.

Suppose the bipartite instance $B$ was $1 - \delta$ satisfiable. Let $\mathrm{label} : U \cup V \to [d]$ denote a labeling
that satisfies $(1 - \delta)$ fraction of the constraints. Let $\delta_u$ denote the fraction of constraints incident
on $u$ that are violated by $\mathrm{label}$, so that $\sum_u \delta_u \le n\delta$. In the clique corresponding to $u$, this labeling
violates at most $(\delta_u \Delta) \cdot (\Delta - 1)$ constraints. Thus the total number of violated constraints is at most
$\sum_u \delta_u \Delta(\Delta - 1) \le n\delta\Delta(\Delta - 1) = 2\delta n\binom{\Delta}{2}$. Thus at least $1 - 2\delta$ of the constraints in $H$ are satisfiable.

Conversely, suppose an $\eta$ fraction of constraints in the union of cliques instance $H$ was satisfied by a
labeling $\chi$. We would like to extend this coloring to the vertices of $U$ so that at least an $\eta$ fraction of
the constraints in $B$ are satisfied. To this end, consider any $u \in U$ and restrict attention to the clique of
edges added among the neighbors of $u$. Suppose $\eta_u$ fraction of the $\binom{\Delta}{2}$ edges in this clique were satisfied
in $H$ by $\chi$. Note that $E_u[\eta_u] = \eta$. Suppose these $\eta_u\binom{\Delta}{2}$ satisfied edges form $t$ connected components,
with sizes $k_1 \ge k_2 \ge \cdots \ge k_t$. The maximum number of satisfied edges within these components is
$\sum_{i=1}^{t} \binom{k_i}{2}$, which must be at least $\eta_u\binom{\Delta}{2}$. This implies that $\frac{\Delta}{k_1}\binom{k_1}{2} \ge \eta_u\binom{\Delta}{2}$, and in turn, $k_1 \ge \eta_u \cdot \Delta$. In
other words, there exists a connected component of satisfied edges with $\eta_u\Delta$ vertices. Pick any vertex
$v$ in this component and set $\chi(u) := \sigma_{uv}^{-1}(\chi(v))$. Note that this label for $u$ satisfies not only the edge
$(u,v) \in E_B$, but also the edges $(u,w)$ for $w$ lying in this connected component. (This follows by the
uniqueness of the constraints.) Do this independently for each $u \in U$. The total fraction of constraints
satisfied by this extension of $\chi$ to a labeling of $U \cup V$ is now $E_u[\eta_u] = \eta$, which completes the proof.

Finally, observe that in any $\Delta$-nice instance, we can find a matching of size at least $1/2\Delta$ and satisfy
all these edges; hence the parameter $\eta$ must be below $1/2\Delta$. ■

In the next section, we give a reduction from such $\Delta$-nice instances of Unique Label Cover to Sparsest Cut
on constant treewidth graphs. A little notation will be useful: for a vertex $x = (x_1, x_2, \ldots, x_d) \in \{-1, 1\}^d$
and a permutation $\sigma$, define $\sigma(x)$ to be the vector

$$(x_{\sigma^{-1}(1)}, x_{\sigma^{-1}(2)}, \ldots, x_{\sigma^{-1}(d)})$$

and define $\bar{x} = (-x_1, -x_2, \ldots, -x_d)$.

5.2 The Basic Instance

Consider a $\Delta$-nice Unique Label Cover instance $H = (V_H, E_H)$ with bijections $\{\sigma_{u,v}\}_{(u,v) \in E_H}$ and label
set $[d]$. Let $n := |V_H|$ be the number of vertices in $H$ and $m := |E_H| = n\binom{\Delta}{2}$ be the number of edges in
$H$. Let $N := n \cdot 2^d$ and $M := m \cdot 2^d$; hence $M/N = m/n = \binom{\Delta}{2}$.

The Sparsest Cut instance $(G, D)$ on the new set of nodes $V$ is the following:

• Nodes. Take a cube $Q_v = \{-1, 1\}^d$ for each $v \in V_H$. Add two new nodes $s, t$. This is the new
set of nodes $V$ of size $n \cdot 2^d + 2 = N + 2$. We will use $x \in \{-1, 1\}^d$ to denote cube nodes (i.e.,
those in $V \setminus \{s, t\}$), and will use the notation $x_v$ to indicate that $x$ is a cube node in cube $Q_v$.

• Supply Edges. Add edges of capacity $1/N$ from $s$ to each node in $\bigcup_{v \in V_H} Q_v$, and the same from
$t$ to these nodes. There are $2N$ such edges, which we call star edges.

Add all the hypercube edges, but with capacity $\alpha/N$ for some parameter $\alpha > 0$. There are
$n \cdot d2^{d-1} = dN/2$ cube edges, which have total capacity $\alpha d/2$. (Think of $\alpha$ as a small quantity.)

• Demand Edges. For each edge $e = (v, w) \in E_H$ and for each $x \in \{-1, 1\}^d$, add a demand edge
with demand $1/M$ between $x \in Q_v$ and $\overline{\sigma_{vw}(x)} \in Q_w$. (If $\sigma_{vw}$ were the identity permutation,
then we would add a demand edge between each node in $Q_v$ and its antipodal node in $Q_w$.) Thus
there is a total demand of $2^d/M$ for each edge $e \in E_H$ and hence a total of $m \cdot 2^d/M = 1$ of such
demand. Observe that there are $\Delta(\Delta - 1)$ demand edges $\{x_v, \cdot\}$ incident to each node $x_v \in Q_v$ for
each $v \in V_H$.
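The construction above can be written down directly for small parameters. The sketch below is illustrative: the function name, the encoding of $H$ as a list of cliques (assumed to share at most one vertex pairwise, so that $H$ is simple), labels $0, \ldots, d-1$, and permutations stored as tuples are all assumptions rather than notation from the text.

```python
from fractions import Fraction
from itertools import product

def basic_instance(cliques, sigma, d, alpha):
    """Supply and demand edge lists (u, v, weight) of the instance (G, D).

    cliques: list of vertex tuples, one per clique of H.
    sigma[(v, w)]: tuple p with p[a] = sigma_vw(a) on labels 0..d-1.
    """
    verts = sorted({v for c in cliques for v in c})
    h_edges = [(v, w) for c in cliques for i, v in enumerate(c) for w in c[i + 1:]]
    cube = list(product((-1, 1), repeat=d))
    N, M = len(verts) * 2 ** d, len(h_edges) * 2 ** d
    supply = []
    for v in verts:
        for x in cube:
            supply.append(("s", (v, x), Fraction(1, N)))      # star edge to s
            supply.append(((v, x), "t", Fraction(1, N)))      # star edge to t
            for i in range(d):
                if x[i] == -1:                                # list each cube edge once
                    y = x[:i] + (1,) + x[i + 1:]
                    supply.append(((v, x), (v, y), Fraction(alpha) / N))
    demand = []
    for v, w in h_edges:
        p = sigma[(v, w)]
        inv = [p.index(j) for j in range(d)]                  # inverse permutation
        for x in cube:
            # sigma(x)_j = x_{sigma^{-1}(j)}; then negate every coordinate.
            y = tuple(-x[inv[j]] for j in range(d))
            demand.append(((v, x), (w, y), Fraction(1, M)))
    return supply, demand
```

A convenient test case is the Fano plane (seven 3-cliques on seven vertices, each vertex in three cliques), which is $3$-nice and has no parallel edges.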

This instance will be a good starting point for the powering operation of Section 4.2. Recall that a cut
is admissible if it separates $(s,t)$, and is inadmissible otherwise. We will show that in the good case,
there is a sparse admissible cut, whereas in the bad case, all admissible cuts have much higher sparsity.
Additionally, we will prove a (weaker) lower bound on the sparsity of inadmissible cuts.

Intuitively, the factor of two comes from the following facts: if we connect two hypercubes with demands
between $x$ in the first hypercube and $\bar{x}$ in the second, choosing the same dictator cut on both hypercubes
cuts all demand pairs, while choosing different dictator cuts on the two hypercubes cuts only half
the demand pairs. Thus restricted to dictator cuts, we get a gap of two. How do we exclude cuts that
are far from dictators?$^4$ Here we use the fact that sparse cuts should cut few supply edges within each
hypercube. Friedgut's junta theorem tells us that cuts that do not cut much more than a dictator cut
in a hypercube are close to juntas, which is sufficient to be able to "decode" a sparse cut into a good
assignment to the unique games instance.

We start by recording some basic properties of G, which will prove useful for the powering operation. 

Observation 5.3 Let $G$ be the network defined above.

• The total capacity in the network is $(2 + \alpha d/2)$ (the $2N$ star edges contribute $2$ and the cube edges $\alpha d/2$), and the total demand is 1.

• The treewidth of the supply graph is at most $2^d$, the size of each hypercube.

• Any admissible cut has capacity at least 1.

4 This is where we are able to improve on the MaxCut UG-hardness of [KKMO07, MOO10], where the noise operator
is applied to such a construction, and majority-is-stablest is used to exclude cuts far from dictators.

5.2.1 Completeness

Lemma 5.4 If there exists a labeling $f : V \to [d]$ satisfying at least $1 - \eta$ fraction of the Unique Label
Cover instance, then there exists a cut in $(G, D)$ of capacity at most $(1 + \alpha/2)$ that cuts at least $(1 - \eta)$
demand.

Proof. Consider the set $A$ that contains $s$ and, for each $v \in V_H$, contains $\{x \in Q_v \mid x_{f(v)} = 1\}$. This
cut separates each node in $V \setminus \{s, t\}$ from either $s$ or $t$, and hence separates $N$ of the star edges to
cut total capacity $N \cdot 1/N = 1$. Moreover, it cuts exactly $2^{d-1}$ edges from each hypercube, and hence
$\alpha/N \cdot n2^{d-1} = \alpha/2$ capacity in the cube edges. This gives a total of $(1 + \alpha/2)$ capacity cut by $(A, \bar{A})$.

There are at least $(1 - \eta)m$ edges $(v, w) \in E_H$ with $\sigma_{vw}(f(v)) = f(w)$. Consider one such edge $(v,w)$
and some $x \in Q_v$; say $x_{f(v)} = b \in \{-1, 1\}$. Then its demand edge corresponding to $(v, w)$ goes to
$\overline{\sigma_{vw}(x)} \in Q_w$, whose $f(w)$th coordinate contains

$$-x_{\sigma_{vw}^{-1}(f(w))} = -x_{f(v)} = -b.$$

Hence, $A$ cannot contain both $x \in Q_v$ and $\overline{\sigma_{vw}(x)} \in Q_w$ and thus cuts all $2^d$ demands corresponding to
$(v,w) \in E_H$. The total demand cut is therefore at least

$$(1 - \eta)m \cdot 2^d/M = (1 - \eta).$$

This completes the proof. ■ 
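The completeness argument can be checked numerically on a toy instance. The sketch below computes the capacity and demand cut by the dictator cut $A$ directly, without materializing the graph; the function name and the encoding (labels $0, \ldots, d-1$, permutations as tuples, vertices of $H$ numbered $0, \ldots, n-1$) are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def dictator_cut_stats(h_edges, sigma, label, n, d, alpha):
    """Capacity and demand cut by A = {s} plus {x in Q_v : x[label[v]] == 1}."""
    cube = list(product((-1, 1), repeat=d))
    N, M = n * 2 ** d, len(h_edges) * 2 ** d

    def in_A(v, x):
        return x[label[v]] == 1

    cap = Fraction(0)
    for v in range(n):
        for x in cube:
            cap += Fraction(1, N)          # exactly one of the two star edges at (v, x) is cut
            for i in range(d):
                if x[i] == -1:             # visit each cube edge once
                    y = x[:i] + (1,) + x[i + 1:]
                    if in_A(v, x) != in_A(v, y):
                        cap += Fraction(alpha) / N
    dem = Fraction(0)
    for v, w in h_edges:
        p = sigma[(v, w)]
        inv = [p.index(j) for j in range(d)]
        for x in cube:
            y = tuple(-x[inv[j]] for j in range(d))   # negated sigma_vw(x)
            if in_A(v, x) != in_A(w, y):
                dem += Fraction(1, M)
    return cap, dem
```

When the labeling satisfies every edge the function returns capacity $1 + \alpha/2$ and demand $1$; an unsatisfied edge contributes only half of its demand, which is the source of the factor of two discussed in the intuition above.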

5.2.2 Soundness 

Theorem 5.5 For $\varepsilon > 0$, let $\alpha = \varepsilon^2$. If the resulting Sparsest Cut instance has an admissible cut $(A, \bar{A})$
in $(G, D)$ of sparsity less than $(2 - 4\varepsilon)$, then there exists a labeling for the Unique Label Cover instance
that satisfies at least $\eta' := (\varepsilon/2 - 1/\Delta)/2^{O(\varepsilon^{-4})}$ fraction of the edges in $H$.

Proof. A road map of the proof: We first show that a large fraction of the cubes are "good", i.e., for
most of the cubes, very few of the supply edges are cut. Using Friedgut's Junta Theorem, these good
cubes are close to being "juntas", i.e., each contains a small number of influential coordinates. Finally,
for a large amount of the demand to be separated using these juntas, there is a non-trivial number of
edges in $H$ that have end-points whose juntas share a common coordinate, so a randomized rounding
procedure gives the claimed good labeling for $H$.

Claim 5.6 The cut $(A, \bar{A})$ must separate more than $(\frac{1}{2} + \varepsilon)$ demand.

Proof. By Observation 5.3, $(A, \bar{A})$ has capacity at least 1. For the sparsity to be less than $(2 - 4\varepsilon)$,
$(A, \bar{A})$ must cut at least $1/(2 - 4\varepsilon) > (\frac{1}{2} + \varepsilon)$ units of demand. ■

Call a vertex $v \in V_H$ good if the number of cube edges from $Q_v$ cut by $(A, \bar{A})$ is at most $\frac{32}{\alpha\varepsilon} 2^{d-1}$; we also
say such a cube is good. (Note that a dimension cut, a.k.a. a dictator, would cut exactly $2^{d-1}$ edges,
so good cubes do not have "too many more" cut edges than a dictator.)

Claim 5.7 There are at most $(\varepsilon/8)n$ bad vertices in $V_H$.

Proof. Suppose not: Then the number of cube edges cut is strictly more than $(\varepsilon/8)n \cdot (32/\alpha\varepsilon)2^{d-1}$,
each of capacity $\alpha/N$. This gives a total capacity of more than 2. But the total demand in the instance
is 1, which would imply the sparsity of $(A, \bar{A})$ is more than 2, a contradiction. ■

The rest of the argument shows that a cut separating $(1/2 + \varepsilon)$ units of demand (by Claim 5.6) that is
good on most cubes (by Claim 5.7) can be decoded to a labeling for the Unique Label Cover instance.
(Henceforth, we will not worry about the supply edges, etc., and will just consider the cubes and the
demands.)

Let us view the cut $(A, \bar{A})$ as a function $f'' : \bigcup_{v \in V_H} Q_v \to \{-1, 1\}$. We now perform two small changes
to this function. Firstly, for any bad vertex $v \in V_H$, let us change $f''$ on the nodes of $Q_v$ to be identically
equal to 1. Note that the new function (which we call $f'$) might separate less demand, but since we
changed the function value on at most an $\varepsilon/8$ fraction of the cube nodes and the instance is regular, the demand
separated must still be at least $(1/2 + \varepsilon - 2 \cdot \varepsilon/8) = (1/2 + 3\varepsilon/4)$.

Secondly, we use the following rephrasing of Friedgut's Junta Theorem [Fri98] that has been used in the
context of hardness for cut-related problems (see, e.g., [CKK+06]):

Theorem 5.8 (Friedgut's Junta Theorem) If a function $f : \{-1, 1\}^d \to \{-1, 1\}$ cuts at most
$c2^{d-1}$ edges of the cube $\{-1, 1\}^d$, then it is $\varepsilon$-close (i.e., differs in at most $\varepsilon 2^d$ points) from a function
$f' : \{-1, 1\}^d \to \{-1, 1\}$ that is an $\exp\{O(c/\varepsilon)\}$-junta.

(Recall that the function $f$ is a $K$-junta if there is a set $S \subseteq [d]$ of size at most $K$, such that for any
$x \in \{-1, 1\}^d$, specifying the coordinates of $x$ that lie within $S$ determines the value $f(x)$. I.e., $f$ is
constant on the subcubes obtained by fixing the bits in $S$ and running over all the other bits.)

Using Theorem 5.8 for each good cube, we replace the function $f'$ restricted to that cube by its
$\varepsilon/8$-close $K$-junta, where $K := \exp\{O(1/\varepsilon^2\alpha)\} = \exp\{O(\varepsilon^{-4})\}$. Here we used the fact that the number
of edges cut in a good cube is at most $\frac{32}{\alpha\varepsilon} 2^{d-1}$ and the assumption $\alpha = \varepsilon^2$. This redefined function, which
we call $f$, differs from $f'$ on at most $\varepsilon/8$ fraction of all the $n \cdot 2^d$ cube nodes, which might result in
at most $2 \cdot \varepsilon/8 = \varepsilon/4$ fraction of the demand no longer being separated. This still leaves at least
$1/2 + 3\varepsilon/4 - \varepsilon/4 = 1/2 + \varepsilon/2$ demand separated. Moreover, now the newly redefined function $f$ is a
$K$-junta on each of the cubes (on the bad cubes by the first redefinition and on the good cubes by the
second), and $f$ still separates $(1/2 + \varepsilon/2)$ units of demand.

For each $v \in V_H$, define the set $J_v \subseteq [d]$ to be the set of $K$ most influential coordinates of $f$ restricted
to the cube $Q_v$: since $f|_{Q_v}$ is a $K$-junta, specifying $\{x_i\}_{i \in J_v}$ fixes the value $f(x_v)$. Secondly, call an edge
$(v,w) \in E_H$ compatible if $\sigma_{vw}(J_v) \cap J_w \neq \emptyset$, i.e., if $f|_{Q_v}$ and $f|_{Q_w}$ "share" an influential coordinate (after
applying the right permutation). Observe that if an edge is compatible, then assigning each vertex $v$
a label randomly chosen from its set $J_v$ would satisfy each compatible edge with probability at least
$1/K^2$. The following lemma is now the final piece in the argument.

Lemma 5.9 The number of compatible edges $(v,w) \in E_H$ is at least $(\varepsilon/2 - 1/\Delta)m$.

Proof. Consider all the cubes $Q_v$. Recall that we don't care about the supply edges for this argument,
only the demand edges. For each $(v,w) \in E_H$, these demand edges give a matching between the nodes
of $Q_v$ and $Q_w$; the union of all these matchings gives us all the demand edges. Finally, $f$ gives us a
$\{-1, 1\}$-coloring of the nodes of the cubes and separates $(1/2 + \varepsilon/2)$ fraction of these demand edges.

For each cube $Q_v$, collapse all the nodes $x_v$ whose values on the coordinates in $J_v$ are the same to
get a cube $Q_v^\circ$ isomorphic to $\{-1, 1\}^K$. (Each new node $x_v^\circ$ in $Q_v^\circ$ consists of $2^{d-K}$ nodes collapsed
together.) Moreover, since any two nodes collapsed together agree on their $f$-value, the number of
separated demand edges in the resulting multigraph remains unchanged.

Suppose $(v, w)$ is not compatible. We claim that the $2^d$ demand edges between $Q_v$ and $Q_w$ now form
a complete bipartite multigraph between $Q_v^\circ$ and $Q_w^\circ$, with $2^{d-2K}$ demand edges going between each
$x^\circ \in Q_v^\circ$ and $y^\circ \in Q_w^\circ$. For simplicity, assume that $\sigma_{vw}$ is the identity map, so incompatibility means
that $J_v \cap J_w = \emptyset$, and imagine that $J_v = \{1, 2, \ldots, K\}$ and $J_w = \{K + 1, K + 2, \ldots, 2K\}$. A demand
edge that used to go between $(b_1, b_2, \ldots, b_d) \in Q_v$ and $(-b_1, -b_2, \ldots, -b_d) \in Q_w$ now goes between
$(b_1, b_2, \ldots, b_K) \in Q_v^\circ$ and $(-b_{K+1}, \ldots, -b_{2K}) \in Q_w^\circ$. There are $2^{d-2K}$ of these edges, and they are the
only ones. This proves the claim.

Now suppose there were no compatible edges at all. Then for each edge $(v, w)$ in $H$, we would add
a complete bipartite multigraph between the $2^K$ nodes in $Q_v^\circ$ and $Q_w^\circ$. This would be exactly the
multigraph $H \times I_{2^K}$ (where $I_\ell$ is a graph on $\ell$ vertices and no edges at all), but with each edge replicated
$2^{d-2K}$ times. It is easy to show that, because $H$ is a union of $\Delta$-cliques, $f$
can cut at most $1/2 + 1/\Delta$ fraction of the demand edges; note this is precisely where we use the
union-of-cliques structure of $H$. The simple proof is in Claim 5.11.

However, we claimed that $f$ cut $1/2 + \varepsilon/2$ fraction of the edges, so there must be at least one compatible
edge. Observe that the number of cut edges behaves in a Lipschitz fashion as we change the number
of compatible edges: Each compatible edge $(v, w) \in E_H$ means the cut may be higher by at most an
additive $2^d$ amount (since this is the total number of demand edges corresponding to an $H$-edge), which
is a $1/m$ fraction of the total demand. If we want the fraction of demand edges cut to increase by
$\varepsilon/2 - 1/\Delta$, and each compatible edge can increase this additively by at most $1/m$, we have at least
$(\varepsilon/2 - 1/\Delta)m$ compatible edges. ■

Choosing a label for each $v \in V_H$ randomly from $J_v$ and using Lemma 5.9, the expected fraction of
satisfied edges in $E_H$ is at least $(\varepsilon/2 - 1/\Delta)/K^2$. Noting that $K = \exp\{O(\varepsilon^{-4})\}$ completes the proof of
Theorem 5.5. ■

Finally, we show that inadmissible cuts cannot have sparsity better than 1.

Claim 5.10 Any inadmissible cut $(A, \bar{A})$ satisfies $\mathrm{dem}(A, \bar{A}) \le \mathrm{cap}(A, \bar{A})$.

Proof. By Lemma 4.6, we can assume $(A, \bar{A})$ is connected and assume w.l.o.g. that $s, t \notin A$. Then
$A$ is a connected subset of one cube $Q_v$. This means $\partial A$ contains all star edges adjacent to $A$, so
$\mathrm{cap}(A, \bar{A}) \ge 2|A| \cdot 1/N$. Moreover, for each $x_v \in A$, all the $\Delta(\Delta - 1)$ demand edges incident to $x_v$ are
cut, so the total demand cut is at most $\Delta(\Delta - 1) \cdot |A| \cdot 1/M = 2|A|/N$. ■

5.2.3 MaxCut on Lifts of $K_n$

The following simple lemma was useful in the analysis of Theorem 5.5.

Claim 5.11 Let $H$ be a multigraph that is a union of cliques, each on exactly $\Delta \ge 2$ vertices. Let
$I_\ell = (\{1, 2, \ldots, \ell\}, \emptyset)$ be a graph of $\ell$ vertices with no edges. For each $\ell \in \mathbb{Z}_{\ge 1}$, every cut in the graph
$H \times I_\ell$ contains at most $\frac{1}{2} + \frac{1}{2(\Delta - 1)} \le \frac{1}{2} + \frac{1}{\Delta}$ fraction of the edges.

Proof. For $\ell = 1$, $H \times I_1 = H$. Consider any $\Delta$-clique in $H$; any cut (i.e., a two-coloring of the vertices)
separates at most

$$\Bigl\lceil \frac{\Delta}{2} \Bigr\rceil \Bigl\lfloor \frac{\Delta}{2} \Bigr\rfloor \le \Bigl(\frac{1}{2} + \frac{1}{2(\Delta - 1)}\Bigr)\binom{\Delta}{2}$$

edges. Hence, any cut separates at most the claimed fraction of edges in each clique, and the fact that
$H$ is a union of (edge-disjoint) cliques gives the result for $\ell = 1$.

For the general $\ell$ case, fix a cut of $H \times I_\ell$, viewed as a coloring of its vertices blue and red. For each $v$, let $b_v$ be the fraction of its $\ell$ copies colored blue. Color $v$ blue with
probability $b_v$ and red with probability $1 - b_v$. This gives a random coloring (i.e., cut) of the vertices of $H$. For
any edge $(u, v)$ in $H$, the fixed cut separates $\ell^2 \cdot (b_u(1 - b_v) + (1 - b_u)b_v)$ edges in $H \times I_\ell$, which is exactly $\ell^2$
times the probability of $(u, v)$ being cut by the random coloring in $H$. Hence the fraction of edges cut
in $H \times I_\ell$ is exactly the expected fraction of edges in $H$ cut by the random coloring. But the argument
above implies that each 2-coloring of $H$ cuts at most $1/2 + 1/2(\Delta - 1)$ fraction of its edges, which upper
bounds the expectation and proves the claim. ■
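Claim 5.11 is easy to confirm by exhaustive search for small cliques. The sketch below uses hypothetical helper names; it builds $H \times I_\ell$ by joining every copy of $u$ to every copy of $v$ for each edge $(u,v)$ of $H$, and computes the maximum cut fraction by trying all two-colorings, so it is only feasible for a handful of vertices.

```python
from fractions import Fraction
from itertools import combinations, product

def lift(edges, ell):
    """Edges of H x I_ell: every copy of u is joined to every copy of v."""
    return [((u, a), (v, b)) for u, v in edges for a in range(ell) for b in range(ell)]

def max_cut_fraction(edges):
    """Largest fraction of edges separated by any two-coloring (brute force)."""
    verts = sorted({x for e in edges for x in e})
    best = 0
    for colors in product((0, 1), repeat=len(verts)):
        col = dict(zip(verts, colors))
        best = max(best, sum(col[u] != col[v] for u, v in edges))
    return Fraction(best, len(edges))

def clique(delta):
    """Edge list of the complete graph on delta vertices."""
    return list(combinations(range(delta), 2))
```

For a single clique $K_\Delta$ the maximum cut fraction is $\lceil \Delta/2 \rceil \lfloor \Delta/2 \rfloor / \binom{\Delta}{2}$, and lifting by $I_\ell$ does not push it above the bound $\frac{1}{2} + \frac{1}{2(\Delta-1)}$.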
5.3 Putting It Together

To boost the hardness result from Section 5.2, we will use the powering operation on the instance defined
above. Intuitively, our proof gave us a $2 - \varepsilon$ gap as long as we could ignore the inadmissible cuts.
The powering operation decreases the sparsity in both the good and the bad cases. However, it ensures
that even inadmissible cuts induce admissible cuts at lower levels and thus cannot have much better
sparsity.

Let $G_1$ be the instance from Section 5.2. Then Observation 5.3, Theorem 5.5, and Claim 5.10 can be
summarized as:

Lemma 5.12 Suppose that there is no labeling $f : V \to [d]$ satisfying an $\eta$ fraction of the unique label
cover instance. Then there exists $\eta' := \eta'(\varepsilon, \eta)$ such that the graph $G_1$ satisfies:

• Any admissible cut $(A, \bar{A})$ has capacity $\mathrm{cap}(A, \bar{A})$ at least 1.

• Any admissible cut $(A, \bar{A})$ cuts at most $(\frac{1}{2} + \eta') \cdot \mathrm{cap}(A, \bar{A})$ demand.

• Any inadmissible cut $(A, \bar{A})$ cuts at most $\mathrm{cap}(A, \bar{A})$ demand.

Let Gt be the instance applying the powering operation to the instance G±, and let £ = 4/e and a = e 2 . 
Using Lemma 5.4 and Lemma 4.5, a straightforward calculation shows: 

Lemma 5.13 If there exists a labeling f : V — > [d] satisfying at least 1 — r? fraction of the unique label 
cover instance, then there exists a cut (Ai,A$) in Gi such that cap(A, A) < (1 + §) < 1 + 4e and 
dem(A,A)>£(l-ri). 

Hence, in the "yes" case, the sparsity is at most 

(1+jf) 

e(i-nY 

On the other hand, Lemma 4.7 implies that: 



(5.14) 



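Lemma 5.14 Suppose that there is no labeling $f : V \to [d]$ satisfying an $\eta$ fraction of the unique label
cover instance. Then the graph $G_\ell$ satisfies:

• Any admissible cut $(A, \bar{A})$ cuts at most $\ell(\tfrac{1}{2} + \eta') \cdot \mathrm{cap}(A, \bar{A})$ demand.

• Any inadmissible cut $(A, \bar{A})$ cuts at most $\left(\ell(\tfrac{1}{2} + \eta') + \tfrac{1}{2}\right) \mathrm{cap}(A, \bar{A})$ demand.

In both these cases, the sparsity is at least

$$\frac{1}{\ell(\tfrac{1}{2} + \eta') + \tfrac{1}{2}} \ge \frac{1}{\ell(\tfrac{1}{2} + \eta' + \epsilon/2)}. \tag{5.15}$$

From (5.14) and (5.15), we conclude that the hardness is at least $\frac{2(1 - \eta)}{(1 + 4\epsilon)(1 + 2\eta' + \epsilon)} \ge 2(1 - 5\epsilon - \eta - 2\eta')$.
Thus for any $\epsilon' > 0$, we can pick $\epsilon$ small enough to get a $2 - \epsilon'$ hardness.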
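The final calculation can be sanity-checked numerically. The sketch below is illustrative only: it assumes the "yes"-case sparsity bound $(1 + 4\epsilon)/(\ell(1 - \eta))$ and the "no"-case bound $1/(\ell(\tfrac{1}{2} + \eta' + \epsilon/2))$ with $\ell = 4/\epsilon$, which is our reading of (5.14) and (5.15). The function names are hypothetical.

```python
from fractions import Fraction


def hardness_ratio(eps, eta, eta_prime):
    """Ratio of the assumed "no"-case lower bound to the assumed "yes"-case upper bound."""
    ell = Fraction(4) / eps
    yes_upper = (1 + 4 * eps) / (ell * (1 - eta))                   # reading of (5.14)
    no_lower = 1 / (ell * (Fraction(1, 2) + eta_prime + eps / 2))   # reading of (5.15)
    return no_lower / yes_upper


def claimed_lower_bound(eps, eta, eta_prime):
    """The simplified bound 2(1 - 5*eps - eta - 2*eta') stated in the text."""
    return 2 * (1 - 5 * eps - eta - 2 * eta_prime)


# Hypothetical parameter values, chosen only for illustration.
eps, eta, eta_prime = Fraction(1, 100), Fraction(1, 1000), Fraction(1, 1000)
ratio = hardness_ratio(eps, eta, eta_prime)
bound = claimed_lower_bound(eps, eta, eta_prime)
```

Exact rational arithmetic is used so that the comparison between `ratio` and `bound` is not affected by floating-point rounding.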

6 A $2 - \epsilon$ Integrality Gap for Sherali-Adams

In this section, we show how to translate Sherali-Adams integrality gaps for the MaxCut problem
into corresponding Sherali-Adams integrality gaps for Non-Uniform Sparsest Cut.

6.1 The Sherali-Adams LP and Consistent Local Distributions 

We begin by recording a standard result stating that an $r$-round Sherali-Adams solution is essentially
equivalent to the existence of a collection of consistent "local" distributions $\{\mathcal{D}_S\}_{|S| \le r}$.

Theorem 6.1 There exists an $(x, y) \in \mathrm{SA}_r(n)$ if and only if for every $S \subseteq V$ of size at most $r$, there
exists a distribution $\mathcal{D}_S$ over subsets of $S$ and these distributions satisfy the following property: For any
$Q \subseteq S$ and for every $A \subseteq Q$,

$$\mathcal{D}_Q(\{A\}) = \mathcal{D}_S(\{B : B \cap Q = A\}). \tag{6.16}$$

Proof. Assume we have $(x, y) \in \mathrm{SA}_r(n)$. For a set $S \subseteq V$ such that $|S| \le r$, we define $\mathcal{D}_S$ such that
$\mathcal{D}_S(T) = x(S, T)$. By (2.4) and (2.2), all $x(S, T) \ge 0$ and $\sum_{T \subseteq S} x(S, T) = 1$, so this is a well-defined
distribution. We now need to show that these distributions satisfy (6.16), i.e., we need to show that

$$x(Q, A) = \sum_{B \subseteq S :\, B \cap Q = A} x(S, B).$$

This follows directly from Lemma 2.1:

$$\sum_{B \subseteq S :\, B \cap Q = A} x(S, B) = \sum_{B' \subseteq S \setminus Q} x(Q \cup (S \setminus Q), B' \cup A) = x(Q, A).$$

On the other hand, assume we have distributions $\mathcal{D}_S$ for all $S \subseteq V$ such that $|S| \le r$. Set $x(S, T) =
\mathcal{D}_S(T)$. The fact that $\mathcal{D}_S$ is a distribution implies that $\sum_{T \subseteq S} \mathcal{D}_S(T) = 1$ and that the non-negativity
constraints are satisfied. Finally, (6.16) implies that the consistency constraint

$$\mathcal{D}_S(T) = \mathcal{D}_{S+u}(T) + \mathcal{D}_{S+u}(T + u)$$

is satisfied. ■
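To make the consistency condition (6.16) concrete, here is a minimal sketch that checks it for a small family of local distributions. It assumes distributions are stored as dictionaries from `frozenset` to exact probabilities; the helper names (`subsets`, `restrict`, `is_consistent`) and the example instance are hypothetical and not part of the paper.

```python
from fractions import Fraction
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def restrict(dist, Q):
    """Push a distribution over subsets of S forward along B -> B & Q."""
    Q = frozenset(Q)
    out = {}
    for B, p in dist.items():
        out[B & Q] = out.get(B & Q, Fraction(0)) + p
    return out


def is_consistent(local):
    """Check (6.16) for every pair Q <= S whose distributions are both in `local`."""
    for S, dist_S in local.items():
        for Q in subsets(S):
            if Q not in local:
                continue
            pushed = restrict(dist_S, Q)
            if any(local[Q].get(A, 0) != pushed.get(A, 0) for A in subsets(Q)):
                return False
    return True


# Hypothetical example: the uniform distribution over the three singleton cuts of V = {0, 1, 2},
# restricted to every subset of size at most 2.
V = frozenset({0, 1, 2})
global_dist = {frozenset({v}): Fraction(1, 3) for v in V}
local = {S: restrict(global_dist, S) for S in subsets(V) if len(S) <= 2}
```

Local distributions obtained by restricting one global distribution are always consistent. The interest of Theorem 6.1 is that consistent families need not arise this way, which is what allows integrality gaps.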

6.2 The Integrality Gap Instance 

In this section, we show that the integrality gap of the LP for Sparsest Cut remains $2 - \epsilon$, even after
polynomially many rounds of Sherali-Adams. Recall the construction from Section 4: Given a connected
unweighted instance $H = (V, E_H)$ of MaxCut, it produces an instance $G_\ell$ with vertices $V_\ell$ and supply
edges $E_\ell$, such that the sparsest cut in $G_\ell$ is related to the max cut in $H$. We use the same construction
here, but instead of starting with a hard instance $H$ of MaxCut, we start with a MaxCut instance
exhibiting an integrality gap for $r$ rounds of Sherali-Adams.

We first show that there exist "local" distributions satisfying the conditions of Theorem 6.1 for all
subsets of the vertices of $G_\ell$ of size at most $O(r)$. This implies the existence of an $O(r)$-round
Sherali-Adams solution for the Sparsest Cut instance. We then calculate the value of this fractional solution for
$G_\ell$ and relate it to the integral optimum. This result is a natural extension of those of [LR10, CSW10],
who showed a gap of 2 for the basic LP, also on the fractal for $K_{2,n}$, using the fact that the complete
graph $K_n$ exhibits an integrality gap for the basic MaxCut LP.

Lemma 6.2 Consider an unweighted MaxCut instance $H$ for which we have an $r$-round Sherali-Adams
solution and $G_\ell$ constructed from $H$. For any $T \subseteq V_\ell$ of size at most $r$, there exists a distribution
$\mathcal{D}_T$ over subsets of $T$ such that for any $Q \subseteq T$ and for every $A \subseteq Q$, (6.16) holds.

Proof. Let us first define these local distributions. By Theorem 6.1, an $r$-round Sherali-Adams solution
for the MaxCut instance $H$ gives, for each subset $R \subseteq V_H$ of size at most $r$, a probability distribution
$\mathcal{F}_R$ over subsets of $R$. Moreover, these local distributions $\mathcal{F}_R$ satisfy the consistency constraints (6.16).
We will use this set of distributions $\{\mathcal{F}_R\}_{R \subseteq V_H : |R| \le r}$ to define another set of distributions $\{\mathcal{D}_T\}_{T \subseteq V_\ell : |T| \le r}$
which also satisfy the consistency constraints (6.16).

Recall that $G_\ell$ is obtained by taking a copy of $G_1 = (V_1, E_1)$ and then replacing each edge $e \in G_1$ by a
copy of instance $G_{\ell-1}$ (which we call $G^e_{\ell-1}$, with vertex set $V^e_{\ell-1}$). To define the distribution for $T \subseteq V_\ell$ with $|T| \le r$, we first
extend $T$ to a set $T'$ in the following way:

(i) Set $T' = T$.

(ii) Add $s$ and $t$ to $T'$.

(iii) For $v \in V_1$, do the following:

(a) If $T_{sv} := T \cap V^{sv}_{\ell-1}$ is not empty, add $v$ to $T'$ and recurse on $G^{sv}_{\ell-1}$.

(b) If $T_{vt} := T \cap V^{vt}_{\ell-1}$ is not empty, add $v$ to $T'$ and recurse on $G^{vt}_{\ell-1}$.

Let $T'_1$ be the vertices of $T'$ that are in $G_1$, i.e., $T'_1 = T' \cap V_1$, and let $T'_e$ be all vertices of $T'$ in the
instance of $G_{\ell-1}$ corresponding to edge $e$ in $G_1$.

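The distribution $\mathcal{D}_T$ is given by the following recursive process that takes an instance $G_\ell$ and a set
$T \subseteq V_\ell$ and outputs a random subset $X \subseteq T$:

(i) Put $s$ in $X$; $t$ will never be in $X$.

(ii) Draw a subset $Y$ from $\mathcal{F}_{T'_1}$. With probability 1/2, set $X \leftarrow X \cup Y$. With probability 1/2, set
$X \leftarrow X \cup (T'_1 \setminus Y)$.

(iii) For each vertex $v \in T'_1$, do the following:

(a) If $v \in X$, set $X \leftarrow X \cup T'_{sv}$ and recurse on $G^{vt}_{\ell-1}, T'_{vt}$.

(b) If $v \notin X$, add nothing from $T'_{vt}$ to $X$ (that is, set $X \leftarrow X \cup \emptyset$) and recurse on $G^{sv}_{\ell-1}, T'_{sv}$.

(iv) Note that $X \subseteq T'$, so output $X \cap T$.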
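Steps (i) and (ii) of this process at the top level can be sketched as follows. This is an illustration only: it covers a single level (the recursion of step (iii) is omitted), it assumes $\mathcal{F}$ is given as a dictionary from `frozenset` to exact probabilities, and the names `level_one_distribution` and `cut_probability`, the vertex labels, and the example distribution are all hypothetical.

```python
from fractions import Fraction


def level_one_distribution(f_dist, ground):
    """Exact distribution of X after steps (i)-(ii): s is always in X, t never is."""
    ground = frozenset(ground)
    out = {}
    for Y, p in f_dist.items():
        # Keep Y or its complement within `ground`, each with probability 1/2.
        for chosen in (Y, ground - Y):
            X = frozenset(chosen) | {"s"}
            out[X] = out.get(X, Fraction(0)) + p / 2
    return out


def cut_probability(x_dist, u, v):
    """Probability that exactly one of u, v lies in X."""
    return sum(p for X, p in x_dist.items() if (u in X) != (v in X))


# Hypothetical local distribution F over subsets of {a, b}.
f_dist = {frozenset({"a"}): Fraction(3, 4), frozenset({"a", "b"}): Fraction(1, 4)}
x_dist = level_one_distribution(f_dist, {"a", "b"})
```

The complement step makes every edge incident to $s$ or $t$ cut with probability exactly 1/2, while leaving the probability that a pair of vertices of $H$ is separated unchanged, since a set and its complement separate the same pairs.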

Since we make draws from the probability distributions $\mathcal{F}_{T'_1}$, we should ensure that these distributions
are well-defined. In particular, since $T'_1$ is a subset of $T'$, which could be larger than $T$, we need
to show that the size of $T'_1$ is at most $r$. Indeed, if $T'_1$ contains some vertex $x$ not in $T$, this vertex has
been added due to some $y \in T$ in $G^e_{\ell-1}$ and hence can be charged to $y \in T$. This finishes the definition
of the local distributions $\mathcal{D}_T$.

It remains to show that for any $A \subseteq Q \subseteq T$, the consistency condition (6.16) holds. Let us introduce
some notation: For sets $Y \subseteq X$, let $\Pr_X(\text{pick } Y)$ indicate the probability that $Y$ is chosen from the
distribution $\mathcal{D}_X$. Using this notation, (6.16) is the same as showing that for $T \subseteq V_\ell$ of size at most $r$
and any $A \subseteq Q \subseteq T$,

$$\Pr_Q(\text{pick } A) = \sum_{B \subseteq T :\, B \cap Q = A} \Pr_T(\text{pick } B). \tag{6.17}$$

It is easy to see that it suffices to prove this for the case where $T = Q + q$ for some $q \in V_\ell \setminus Q$. In this
case, things simplify to showing that for every $A \subseteq Q$,

$$\Pr_Q(\text{pick } A) = \Pr_T(\text{pick } A) + \Pr_T(\text{pick } (A + q)). \tag{6.18}$$

To prove (6.18), it then suffices to show

$$\Pr_{Q'}(\text{pick } A) = \Pr_{T'}(\text{pick } A) + \Pr_{T'}(\text{pick } (A + q)) \tag{6.19}$$

for any $A \subseteq Q'$. Recall that even though $|T \setminus Q| = 1$, the extended sets $T'$ and $Q'$ may differ in more
vertices.

We proceed by induction on $\ell$. In the base case, consider $G_1$. If $q$ is $s$ or $t$, this is trivial. Otherwise,
the claim holds by the consistency properties of the $\{\mathcal{F}\}$ distributions. In the inductive case, we assume
(6.19) holds for $G_{\ell-1}$ and want to show that this claim holds for $G_\ell$. There are two subcases: $q \in V_1$ and
$q \notin V_1$. If $q \in V_1$, the sets $Q'$ and $T'$ only differ on vertices in $V_1$ and the claim holds by the consistency
of the $\{\mathcal{F}\}$ distributions.

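Otherwise, $q \notin V_1$. Let $Q'_1 = Q' \cap V_1$ and $A_1 = A \cap V_1$. We denote by $V^e_{\ell-1}$ the vertices of $G^e_{\ell-1}$. Let $A_e$
be $A \cap V^e_{\ell-1}$ and let $Q'_e = Q' \cap V^e_{\ell-1}$ for any edge $e$ in $G_1$. Because our selection process is independent
on each of the $V^e_{\ell-1}$'s given its choice in $V_1$, we have that

$$\Pr_{Q'}(\text{pick } A) = \Pr_{Q'_1}(\text{pick } A_1) \prod_{e \in G_1} \Pr_{Q'_e}(\text{pick } A_e \mid \text{pick } A_1)$$

and

$$\Pr_{T'}(\text{pick } A) = \Pr_{T'_1}(\text{pick } A_1) \prod_{e \in G_1} \Pr_{T'_e}(\text{pick } A_e \mid \text{pick } A_1).$$

Since $q \notin V_1$, $Q'_1 = T'_1$. Also, since $T = Q + q$, observe that $Q'_e$ and $T'_e$ must be the same for all edges $e$
of $G_1$ except for one. Call this edge $e^*$. These two facts imply that

$$\Pr_{T'}(\text{pick } A) = \Pr_{Q'_1}(\text{pick } A_1) \, \Pr_{T'_{e^*}}(\text{pick } A_{e^*} \mid \text{pick } A_1) \prod_{e \ne e^*} \Pr_{Q'_e}(\text{pick } A_e \mid \text{pick } A_1). \tag{6.20}$$

Similarly, we know that

$$\Pr_{T'}(\text{pick } (A + q)) = \Pr_{Q'_1}(\text{pick } A_1) \, \Pr_{T'_{e^*}}(\text{pick } (A_{e^*} + q) \mid \text{pick } A_1) \prod_{e \ne e^*} \Pr_{Q'_e}(\text{pick } A_e \mid \text{pick } A_1). \tag{6.21}$$

By the inductive assumption, we know that

$$\Pr_{Q'_{e^*}}(\text{pick } A_{e^*} \mid \text{pick } A_1) = \Pr_{T'_{e^*}}(\text{pick } A_{e^*} \mid \text{pick } A_1) + \Pr_{T'_{e^*}}(\text{pick } (A_{e^*} + q) \mid \text{pick } A_1).$$

Adding (6.20) and (6.21) therefore gives us

$$\Pr_{T'}(\text{pick } A) + \Pr_{T'}(\text{pick } (A + q)) = \Pr_{Q'_1}(\text{pick } A_1) \prod_{e \in G_1} \Pr_{Q'_e}(\text{pick } A_e \mid \text{pick } A_1) = \Pr_{Q'}(\text{pick } A).$$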



Now, applying Theorem 6.1 to the MaxCut instance from Lemma 6.2, we get an $r$-round Sherali-Adams
solution $y$ for Sparsest Cut. Recall that for $e = \{i, j\}$ we set $y_e = \mathcal{D}_{\{i,j\}}(\{\{i\}, \{j\}\})$, which is the probability
over the distribution $\mathcal{D}_{\{i,j\}}$ that exactly one of the endpoints is chosen. We analyze the Sparsest Cut
objective function value of this fractional solution $y$ next.

Lemma 6.3 Let the value of the $r$-round Sherali-Adams relaxation solution $z$ for the MaxCut instance
on $H$ implied by the distributions $\{\mathcal{F}_S\}$ be $\sum_{e \in E_H} z_e = cm$. Then the sparsity of the $r$-round solution
$y$ is $\frac{1}{\ell c}$.
Proof. Let $\mathrm{cap}_{ij}$ be the capacity of edge $\{i, j\}$ and $\mathrm{dem}_{ij}$ be the demand on edge $\{i, j\}$ in the instance
$G_\ell$. Let $E^C_\ell$ be its capacity edges and $E^D_\ell$ its demand edges.

We begin by proving that $\sum_{e \in E^C_\ell} \mathrm{cap}_e y_e = 1$. First, we claim that $y_e = 2^{-\ell}$ for any capacity edge $e$.
By the symmetry introduced in Step (ii) of the definition of $\mathcal{D}_T$, a capacity edge of $G_1$ is cut with
probability 1/2, so $y_e = 1/2$ for any such edge $e$. Using this fact, a simple induction then suffices.
Second, we claim that $\sum_{e \in E^C_\ell} \mathrm{cap}_e = 2^\ell$. Since $\sum_{e \in E^C_1} \mathrm{cap}_e = 2$, a simple induction again gives us the
desired result. Combining these facts, we get $\sum_{e \in E^C_\ell} \mathrm{cap}_e y_e = 1$.

Next, we show that $\sum_{e \in E^D_\ell} \mathrm{dem}_e y_e = \ell c$. The proof is again by induction on $\ell$. The base case is $G_1$,
where the value is $c$ because $y_e = z_e$ for $e \in E_H$ and all demands are $1/m$. For the induction step,
assume $\sum_{e \in E^D_{\ell-1}} \mathrm{dem}_e y_e = (\ell - 1)c$. For each $e \in E^C_1$, the corresponding $G^e_{\ell-1}$ contributes $\tfrac{1}{2}(\ell - 1)c \cdot \mathrm{cap}_e$
by the inductive hypothesis and the fact that $y_{e'} = \tfrac{1}{2} y_{e''}$ for each edge $e'$ of $G_\ell$ inside that copy and the corresponding edge $e''$ of $G_{\ell-1}$. Summing over
all $e \in E^C_1$ gives $(\ell - 1)c$. By the base case, level $\ell$ demands contribute $c$, giving a total of $\ell c$ demand
cut. ■
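The two inductions in this proof can be traced with a few lines of exact arithmetic. The sketch below is an illustration under stated assumptions: the total capacity doubles per level starting from 2, the LP value of a capacity edge halves per level starting from 1/2, and the demand recurrence is $D_\ell = c + \sum_{e \in E^C_1} \tfrac{1}{2} \mathrm{cap}_e D_{\ell-1}$ with $\sum_{e \in E^C_1} \mathrm{cap}_e = 2$. The function name `lp_value` is hypothetical.

```python
from fractions import Fraction


def lp_value(levels, c):
    """Return (capacity cut, demand cut, sparsity) of the fractional solution on G_levels."""
    total_cap = Fraction(2)      # sum of capacities in G_1
    y_cap = Fraction(1, 2)       # LP value of each capacity edge in G_1
    demand = Fraction(c)         # demand cut in G_1 (base case)
    for _ in range(2, levels + 1):
        total_cap *= 2           # total capacity doubles with each level
        y_cap /= 2               # each capacity edge is cut half as often
        # Copies contribute (1/2) * cap_e * demand each, with capacities summing to 2;
        # the top-level demands add another c.
        demand = c + Fraction(1, 2) * 2 * demand
    cap_cut = total_cap * y_cap
    return cap_cut, demand, cap_cut / demand
```

For example, `lp_value(5, Fraction(9, 10))` gives capacity 1, demand 9/2, and sparsity 2/9, matching $1/(\ell c)$ with $\ell = 5$ and $c = 9/10$.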

Given the above construction, we can now prove the main theorem of this section, which allows us 
to convert Sherali-Adams integrality gaps for MaxCut to Sherali-Adams integrality gaps for Sparsest 
Cut. 
Theorem 6.4 Given an unweighted MaxCut instance $H$ on $n$ nodes and $m$ edges with a max cut of
size $sm$ and an $r$-round Sherali-Adams solution of value at least $cm$, for any constant $\epsilon > 0$ there exists
a Sparsest Cut instance $G$ with an $r$-round Sherali-Adams integrality gap of $(c/s) - \epsilon$, such that the size
of $G$ is $n^{O(c/(s^2 \epsilon))}$.

Proof. We will set $G$ to be the graph $G_\ell$ constructed from base instance $H$ for some value of $\ell$ to be
chosen later. By Lemma 6.2, there exists an $r$-round Sherali-Adams solution $y$ whose sparsity is $\frac{1}{\ell c}$ by
Lemma 6.3. By calculations as in Section 4.2.1, the actual sparsest cut value for $G_\ell$ is at least $\frac{1}{1 + (\ell - 1)s}$.

This gives us an $r$-round Sherali-Adams integrality gap of $\frac{\ell c}{1 + (\ell - 1)s} \ge \frac{c}{s} - \epsilon$ for $\ell = \frac{c}{s^2 \epsilon}$. ■

Plugging in the Sherali-Adams integrality gap instance for MaxCut due to Charikar et al. [CMM09] 
gives us the following corollary: 

Corollary 6.5 For every $\epsilon > 0$, there exists $\gamma > 0$ such that the integrality gap of the Sparsest Cut
relaxation is $2 - \epsilon$ even after $n^\gamma$ rounds of Sherali-Adams.

Proof. From [CMM09, Theorem 5.3], we have that for any $\epsilon' > 0$, there exists $\gamma' > 0$ and an unweighted
MaxCut instance with $M$ edges and $N$ nodes such that the optimal integral solution is at most
$M(1/2 + \epsilon'/6)$ and the LP value after $N^{\gamma'}$ rounds of Sherali-Adams is at least $M(1 - \epsilon'/6)$. From this,
we get $c/s \ge 2 - \epsilon'$. Now Theorem 6.4 allows us, for any $\epsilon'' > 0$, to obtain a Sparsest Cut instance
with integrality gap $2 - \epsilon' - \epsilon''$ after $N^{\gamma'}$ rounds of Sherali-Adams. Setting $\epsilon' = \epsilon'' = \epsilon/2$ gives us the
integrality gap of $2 - \epsilon$. Moreover, the size of the new instance is $n = N^{O(c/(s^2 \epsilon))}$, so setting $\gamma = O(s^2 \epsilon \gamma'/c)$
completes the proof. ■
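The step from the [CMM09] bounds to $c/s \ge 2 - \epsilon'$ can be spelled out as follows. This is a short worked derivation, reading both bounds as fractions of the $M$ edges, so that $s \le 1/2 + \epsilon'/6$ and $c \ge 1 - \epsilon'/6$, and using $1/(1 + x) \ge 1 - x$ for $x \ge 0$ together with $(1 - a)(1 - b) \ge 1 - a - b$ for $a, b \ge 0$:

```latex
\frac{c}{s}
  \;\ge\; \frac{1 - \epsilon'/6}{1/2 + \epsilon'/6}
  \;=\;   \frac{2\,(1 - \epsilon'/6)}{1 + \epsilon'/3}
  \;\ge\; 2\,(1 - \epsilon'/6)(1 - \epsilon'/3)
  \;\ge\; 2\,(1 - \epsilon'/2)
  \;=\;   2 - \epsilon'.
```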

7 Conclusions 

We show how to use the Sherali-Adams hierarchy to get a factor-2 approximation for the Non-Uniform
Sparsest Cut problem on treewidth-$k$ graphs in time $n^{O(k)}$. (This also gives $2^{\tilde{O}(\sqrt{n})}$-time 2-approximation
algorithms for Sparsest Cut on minor-free graphs.) We also show that the Non-Uniform Sparsest Cut
problem is as hard as the MaxCut problem, even for treewidth-2 graphs, which gives us the best 
NP-hardness known (even for the unconstrained problem). Assuming the UGC, this gives a hardness of
$1/0.878 - \epsilon$ for these series-parallel graphs. For graphs of large constant treewidth, we show a Unique
Games hardness of $2 - \epsilon$, which matches our algorithm. Finally, we demonstrate an integrality gap of
$2 - \epsilon$ for Sherali-Adams relaxations after a polynomial number of rounds, even for treewidth-2 graphs.

Many research directions remain open. Among them are getting better hardness results for
Non-Uniform Sparsest Cut, both for restricted graph classes and for the general problem, getting
polynomial-time $O(1)$-approximation algorithms for planar or minor-closed families (using LP/SDP hierarchies or
otherwise), and making progress on the embeddability conjecture from [GNRS04].